We recently migrated from a self-managed Gitlab instance to gitlab.com. The system administrator of the self-managed Gitlab instance said this would simply entail a git pull && git push and that the migration would be done quickly. Depending on your usage of Gitlab, this is either a naïve oversimplification or downright dangerous. The truth is more nuanced and entails quite a bit more work. Since the migration took us a couple of full working days and we wrote some reusable checklists and code in the process, we have jotted them down here. Maybe somebody else can make use of them, too.
When moving off a self-managed instance, there are a couple of high-level planning questions to consider:
- [ ] Are there multiple entities using Gitlab?
  - In our case, three companies used the instance, each with multiple groups. Additionally, a couple of people had projects in their private accounts.
  - When moving, they might follow different strategies; at the very least, you'll need different API keys. In our case, the move even entailed moving to different Forges.
- [ ] How to coordinate the switch between different teams and people?
  - Moving takes time and is not an atomic task. If you have multiple projects and teams, there's communication effort on top of planning and the actual work.
- [ ] Inform your teams with enough time to plan ahead.
  - A week is not enough unless there's nothing else on the table for that week.
  - A team might have used a Gitlab feature that's not easy to migrate or that you didn't think of migrating yourself.
  - In any case, inform them with a strategic plan, otherwise coordination can become more difficult.
  - Maybe you can at least use this blog post as a blueprint for your migration plans.
After these, there are high-level technical tasks:
- [ ] Choose a new Forge.
  - Moving off Gitlab to a different Forge will most likely lose a lot of metadata, make your automation with bots and Gitlab CI unusable, etc.
  - Beware, however, that even moving to another Gitlab instance like gitlab.com will lose metadata and entail manual work.
  - We chose to stick with Gitlab, because we have no reason to move off Gitlab. We're happy users and have been for years.
- [ ] On gitlab.com, create new groups and user accounts.
  - Make sure these have the correct rights set up.
  - Maybe you can use this as a chance to do some cleanup.
- [ ] Invite your teams to the new groups.
Now, to the actual migration tasks. Gitlab has the ability to export and import projects. This can be done in the Web Application. However, depending on the number of projects, this will be quite tedious and error prone. We first tried various Gitlab CLI projects, but that didn't pan out. Using the Gitlab API directly, however, is well documented and straightforward.
Now, on a high level, we'll do the following:
1. Get the current project list together with id and name.
2. Create a mapping between old and new project names.
3. Trigger a job to export each project.
4. Poll the self-managed Gitlab for the export until it can be downloaded.
5. Import each project into gitlab.com.
Here are the details:
- Get the project list:
curl "https://gitlab.200ok.ch/api/v4/projects?private_token=$GITLAB_TOKEN&per_page=100" \
| jq '.[] | { id: .id, from: .path_with_namespace, to: ""}'
This will yield a structure like:
[
{ "id": 1, "from": "200ok/project-name", "to": "" }
]
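Note that per_page=100 is the largest page size the API accepts, so the request above returns at most 100 projects. On a bigger instance, the list has to be collected page by page via the page query parameter. A minimal Ruby sketch of that idea, assuming a hypothetical fetch_page callable that returns the parsed JSON array for a given page number (in practice it would wrap the curl call above with an added &page=N); the stubbed pages and the second project name are made up for illustration:

```ruby
require 'json'

# Collect projects page by page until an empty page comes back.
# `fetch_page` is a hypothetical callable: page number in, array of
# project hashes (as parsed from the API's JSON) out.
def collect_projects(fetch_page)
  projects = []
  page = 1
  loop do
    batch = fetch_page.call(page)
    break if batch.empty?

    projects.concat(batch)
    page += 1
  end
  projects
end

# Reduce each API record to the id/from/to structure used in this post.
def to_mapping(projects)
  projects.map do |p|
    { id: p['id'], from: p['path_with_namespace'], to: '' }
  end
end

# Stubbed pages stand in for real API responses here.
pages = {
  1 => [{ 'id' => 1, 'path_with_namespace' => '200ok/project-name' }],
  2 => [{ 'id' => 2, 'path_with_namespace' => '200ok/other-project' }]
}
fetch_page = ->(page) { pages.fetch(page, []) }

puts JSON.pretty_generate(to_mapping(collect_projects(fetch_page)))
```

Injecting the fetcher keeps the paging logic separate from the HTTP call, so it can be tried out without touching the real instance.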
For automating tasks 2-5, we have written this Ruby script:
require 'json'

projects = [
  # This project will keep its namespace and project name when
  # imported.
  { "id": 1, "from": '200ok/project-name', "to": '' },
  # This project will only be downloaded for archiving, but not
  # imported to gitlab.com
  { "id": 2, "from": '200ok/project-name', "to": nil },
  # This project will be imported to a different namespace and project
  # name.
  { "id": 3, "from": '200ok/project-name', "to": "ns2/project-name-2" },
]

# prepare projects: an empty "to" means "import under the same path"
projects.each do |project|
  to = project[:to]
  project[:to] = project[:from] if to && to.empty?
end

BASE_CMD = 'curl -s --header "PRIVATE-TOKEN: %s" '
EXPORT_CMD = BASE_CMD + '--request POST "https://gitlab.200ok.ch/api/v4/projects/%s/export"'
STATUS_CMD = BASE_CMD + '"https://gitlab.200ok.ch/api/v4/projects/%s/export"'
DOWNLOAD_CMD = BASE_CMD + '--remote-header-name --remote-name "https://gitlab.200ok.ch/api/v4/projects/%s/export/download"'
IMPORT_CMD = BASE_CMD + '--request POST --form "namespace=%s" --form "path=%s" --form "file=@%s" "https://gitlab.com/api/v4/projects/import"'

# The keys contain hyphens, so they have to be quoted to be valid symbols.
tokens = {
  'old-gitlab-admin-token': 'token1',
  'new-gitlab-user-1': 'token2',
  'new-gitlab-user-2': 'token3'
}

# schedule exports from gitlab.200ok.ch
projects.each do |project|
  puts "Requesting export for #{project[:from]}..."
  cmd = EXPORT_CMD % [tokens[:'old-gitlab-admin-token'], project[:id]]
  system(cmd)
  sleep 0.25
end

# loop to find finished exports and import to gitlab.com
remaining = projects
until remaining.empty?
  projects.each do |project|
    file = project[:from].tr('/', '_') + '.tar.gz'
    # a project counts as done once its archive exists locally
    project[:done] = File.exist?(file)
    next if project[:done]

    print "Checking #{project[:from]}..."
    cmd = STATUS_CMD % [tokens[:'old-gitlab-admin-token'], project[:id]]
    result = JSON.parse(%x[#{cmd}])
    puts status = result['export_status']
    if status == 'finished'
      puts "Downloading #{project[:from]}..."
      cmd = DOWNLOAD_CMD % [tokens[:'old-gitlab-admin-token'], project[:id]]
      system(cmd)
      system("mv *_export.tar.gz #{file}")
      if (to = project[:to])
        token = to.start_with?('username1') ? tokens[:'new-gitlab-user-2'] : tokens[:'new-gitlab-user-1']
        puts "Uploading #{to}..."
        # split at the last slash, so nested namespaces stay intact
        ns, _, path = to.rpartition('/')
        cmd = IMPORT_CMD % [token, ns, path, file]
        system(cmd)
      end
    end
    # do not inundate gitlab
    # 5 requests per minute per user
    sleep 15
  end
  remaining = projects.reject { |project| project[:done] }
  puts "Remaining #{remaining.count}/#{projects.count}"
end
puts "All done."
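The polling loop above has no upper bound: a single export that never reaches the finished state keeps it running. One way to guard against that is to give each export an attempt budget. A minimal sketch, assuming a hypothetical fetch_status callable that returns the export_status string of one project (for example by running STATUS_CMD and parsing the JSON); the limit of 40 attempts is an arbitrary assumption, not a value from our migration:

```ruby
# Poll an export until it is finished or the attempt budget runs out.
# `fetch_status` is a hypothetical callable returning the
# `export_status` string of one project.
def wait_for_export(fetch_status, max_attempts: 40, interval: 15)
  max_attempts.times do |attempt|
    status = fetch_status.call
    return :finished if status == 'finished'

    # Sleep between checks, but not after the last one.
    sleep interval if attempt < max_attempts - 1
  end
  :timed_out
end

# Stub: reports "started" twice, then "finished".
statuses = %w[started started finished]
fetch_status = -> { statuses.shift }

puts wait_for_export(fetch_status, max_attempts: 5, interval: 0)
```

A project that times out can then be reported and retried by hand instead of blocking the remaining imports.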
Now your projects are on gitlab.com - you're done, right? Not quite:
- All merge requests, comments, assignments, etc. will belong to the user who owns the API key used for the import. Gitlab has no way of knowing which users to map these things to.
  - [ ] Manually go over all projects and re-assign the relevant items.
- [ ] Webhooks are not included in the export/import process. If you're using Webhooks, you will have to reconfigure them for each project. You can probably use the Gitlab API for it, but we did it by hand, because we took the chance to rewire some notifications.
- [ ] CI/CD Variables are not included in the export/import process. If you're using CI/CD Variables, you will have to reconfigure them for each project. We did this by hand, but we wrote a script that makes it visible where CI/CD Variables are used:
require 'json'

projects = [
  { "id": 1, "from": '200ok/project-name', "to": '' }
]

projects.each do |project|
  variables =
    `curl --silent --header "Private-Token: #{ENV['GITLAB_API_TOKEN']}" "https://gitlab.200ok.ch/api/v4/projects/#{project[:id]}/variables"`
  variables = JSON.parse(variables)
  unless variables.empty?
    puts "* #{project[:from]}"
    puts "#+begin_src json"
    puts JSON.pretty_generate(variables)
    # puts `echo '#{variables.to_json}' | jq '.'`
    puts "#+end_src json"
    puts
  end
end
This will yield an Org mode document like:
* TODO 200ok/200ok.ch
#+begin_src json
[
  {
    "variable_type": "env_var",
    "key": "FTP_HOST",
    "value": "your-ftp-host",
    "protected": false,
    "masked": false,
    "environment_scope": "*"
  }
]
#+end_src json
Then, make the adjustments in the relevant Gitlab projects.
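If the variables should be carried over unchanged, the same listing could also be turned into requests against the new instance. We did not do this ourselves, so take the following as a minimal sketch under assumptions: it uses the project variables endpoint (POST /projects/:id/variables) on gitlab.com, a made-up new project ID of 42, and a NEW_GITLAB_TOKEN environment variable that is not part of the scripts in this post. It only builds the curl commands, so they can be reviewed before anything is executed:

```ruby
require 'shellwords'

# Fields of an exported variable that are passed on when re-creating it.
FIELDS = %w[key value variable_type protected masked environment_scope].freeze

# Build (but do not run) a curl command that would create one variable
# in the project with the given ID on gitlab.com.
# --form-string sends the value literally, even if it starts with "@".
def create_variable_cmd(new_project_id, variable)
  forms = FIELDS.select { |field| variable.key?(field) }.map do |field|
    "--form-string #{Shellwords.escape("#{field}=#{variable[field]}")}"
  end
  url = "https://gitlab.com/api/v4/projects/#{new_project_id}/variables"
  parts = ['curl --silent --request POST',
           '--header "PRIVATE-TOKEN: $NEW_GITLAB_TOKEN"']
  (parts + forms + [Shellwords.escape(url)]).join(' ')
end

# One entry as printed by the listing script above.
variable = {
  'variable_type' => 'env_var',
  'key' => 'FTP_HOST',
  'value' => 'your-ftp-host',
  'protected' => false,
  'masked' => false,
  'environment_scope' => '*'
}

# 42 is a hypothetical project ID on gitlab.com.
puts create_variable_cmd(42, variable)
```

Printing instead of executing is deliberate here: secrets end up in the generated commands, so they should not be written to shared logs or committed anywhere.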
- [ ] Migrate Gitlab container registry.
  - If you're using the Gitlab container registry, maybe even for CI runners, you need to migrate those images and update all projects using them.
- [ ] If you've used bot users that act on commit/push, you'll need to migrate those, along with their Docker images and config.
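For the container registry item above, images can be moved with a plain docker pull, tag and push. A minimal sketch that only prints those commands for review, assuming a hypothetical registry host registry.200ok.ch for the old instance (the real host depends on your setup), registry.gitlab.com for the new one, and made-up image names and tags:

```ruby
# Build the docker commands that copy one image from the old registry
# to the new one. Nothing is executed here; the commands are returned
# so they can be inspected first.
def registry_copy_cmds(old_image, new_image)
  [
    "docker pull #{old_image}",
    "docker tag #{old_image} #{new_image}",
    "docker push #{new_image}"
  ]
end

# Hypothetical image names; adjust hosts, paths and tags to your setup.
images = {
  'registry.200ok.ch/200ok/project-name:latest' =>
    'registry.gitlab.com/ns2/project-name-2:latest'
}

images.each do |old_image, new_image|
  puts registry_copy_cmds(old_image, new_image)
end
```

Both registries require a docker login beforehand, and every project that references the old image path (for example in its CI configuration) still has to be updated separately.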
Now, you're done with the migration of projects from your self-managed Gitlab instance to gitlab.com. However, the work is not done yet. You'll need to:
- [ ] Inform your team members that they now need to update their git repository remotes and stop using the prior Forge.
  - If they use tooling on top of git (like Magit Forge), they will have to update their remotes there, too.
Since it's time critical and error prone to make these adjustments by hand on many projects in a diverse team, we've written a script to automate that process as well:
STDOUT.sync = true

require 'colorize'
require 'yaml'
require 'securerandom'

system 'stty cbreak'

base = ARGV.first
custom_file = File.expand_path('.fix_git_config.yml', ENV['HOME'])
custom = File.exist?(custom_file) ? YAML.load(File.read(custom_file)) : []
mapping = YAML.load(DATA.read).concat(custom)

repos = Dir.glob('**/.git/config', base: base)
repos.each_with_index do |config, index|
  config = File.expand_path(config, base)
  repo = config.sub('/.git/config', '')
  puts ('-' * 60).colorize(:yellow)
  puts "Repo #{index+1}/#{repos.count}: #{repo}".colorize(:yellow)
  ini = File.read(config)
  replacements = {}
  ini_viz = mapping.reduce(ini) do |r, h|
    uuid = SecureRandom.uuid
    replacements[uuid] = h.keys.first.colorize(:red)
    r.gsub(h.keys.first, uuid + h.values.first.colorize(:green))
  end
  ini_viz = replacements.reduce(ini_viz) { |r, kv| r.gsub(*kv) }
  if ini != ini_viz
    puts
    puts ini_viz
    puts
    puts 'Type y to apply, n to skip, anything else to abort.'.colorize(:yellow)
    q = $stdin.sysread 1
    if q == 'y'
      ini = mapping.reduce(ini) { |r, h| r.gsub(h.keys.first, h.values.first) }
      File.open(config, 'w') { |f| f.write(ini) }
      puts
      puts "Updated #{config}"
    elsif q == 'n'
      puts
      puts 'Skipped.'
    else
      puts
      puts 'Abort.'
      system 'stty cooked'
      exit 0
    end
  else
    puts 'No changes required.'
  end
end

system 'stty cooked'

__END__
- gitlab.200ok.ch: gitlab.com
- 200ok/old-project-name: ns2/project-name-2
At the end of the script, there's a mapping between old namespaced project names and new ones. So, if you did this kind of cleanup with the first script, you can do it here, too. The script also appends additional entries from ~/.fix_git_config.yml, if that file exists.
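The entries are applied in order, each as a plain string replacement over the whole config file, so a host entry and a project entry can both act on the same remote URL. A minimal sketch of that behaviour outside the interactive script, using the two example entries from the script and a made-up remote line:

```ruby
require 'yaml'

# Apply every mapping entry in order as a plain string replacement,
# mirroring the reduce/gsub step of the script above.
def apply_mapping(text, mapping)
  mapping.reduce(text) { |r, h| r.gsub(h.keys.first, h.values.first) }
end

mapping = YAML.load(<<~YML)
  - gitlab.200ok.ch: gitlab.com
  - 200ok/old-project-name: ns2/project-name-2
YML

# Made-up remote line as it might appear in .git/config.
before = 'url = git@gitlab.200ok.ch:200ok/old-project-name.git'
puts apply_mapping(before, mapping)
# prints: url = git@gitlab.com:ns2/project-name-2.git
```

Because the replacement is a substring match, a short key also matches inside longer project names. Listing the more specific entries first avoids surprises.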
The usage of this script is: ./fix_git_origin.rb path
Now, you're all set!
It took us close to 4 working days' worth of work to migrate 75 projects across different teams. With this writeup, we hope you will get it done faster!
If you liked this post and want to say 'thanks', please head over to our free/libre and open source software page - and if you like one of them, give it a star on Github or Gitlab.