Migrating from Gmail to Microsoft o365

Published: November 22, 2022
Hooray

Back in August 2021, the team I work with celebrated a huge milestone. We migrated the last mailbox of some 24,000 plus users from Google's Gmail over to Microsoft Exchange Online (Microsoft o365)! It had been quite a bumpy journey, but the ride had finally come to an end. Let me tell you about it!

In 2018, we had a new CIO whose vision was to get everyone onto Microsoft o365. To be honest, a lot of the team were apprehensive, but as we were already paying for the licenses, it made sense. Being firmly set in our ways, we joked about rebranding Gmail with a Microsoft Outlook logo and leaving it there. Most of us had the massive upheaval at the back of our minds from when we migrated from on-prem mail servers to Gmail some 10+ years ago. That move had left some deep scars for those who had been working at the University at the time. After running a consultation period, and some very vocal opinions being shared, we were given the green light to migrate to o365.

From around November 2019 through to January 2020, a small number of the team became early adopters. We took the opportunity to road test o365 and figure out any pain points that would need addressing. Boy did we come across some challenging issues! In November 2019, Microsoft had released a new way to migrate between Gmail and Exchange Online. Before that, the migration used IMAP to sync mailboxes, which brought over your emails and nothing else. The new method made use of Google's API to copy over more data, including calendars, which was a huge win for us as we do love our calendars here! At the time, I had around 5 different shared calendars used by my team. Unfortunately, as early adopters, we found the new migration method more hassle than it should have been.

On my first foray into Outlook Web Access, I quickly came across some major issues! All of a sudden I had a load of shared calendars that were no longer shared. Anyone else who came over had their own standalone copies of the same shared calendars. Even worse, I couldn't delete them! We tested this over and over and found it to be a consistent issue. It happened with every calendar that Google classed you as an owner of; that is, any calendar whose sharing settings you were able to edit. This could have been a showstopper. Luckily, after some weeks, we figured out how to delete these calendars, though it took even longer to stop them from becoming a problem for our 24,000 plus mailboxes. (For anyone seeing this issue, you need to drop calendar permissions down from owner before you migrate an account!) To top it off, the migration even copied over Google's Bank Holidays calendar, which didn't really make much sense as Microsoft have their own.

After a lot of trial and error we had calendars cracked. We decided the best way forward would be to let Microsoft take every mailbox's calendars over and then delete everything but the main calendar. Unfortunately, we noticed a number of other problems. Though the new migration method used Google's API, the only additional data it took over were calendars. It didn't do email signatures, Out of Office messages, email filters (we decided not to worry about these), mailbox delegates (or these), email address aliases or profile photos. We also found that email forwarding set up by the Exchange migration tool didn't always work. This was looking pretty lacklustre and just didn't fit the bill. We wanted the migration to go as smoothly as possible, but the user experience was extremely lacking, with users needing to do the majority of the work themselves.

In January 2020, I started to write an internal tool to tackle this.

Self-Service Migration Portal - Initial version of our internal migration tool

The original intention was a self-service portal. The user would log in to both their Gmail and Microsoft accounts, then hit a button to pull over the bits that the Exchange migration didn't do. I got it working, and it worked well, but it wasn't good enough. What about people who were off work long-term? How would their stuff get transferred over? What if they missed the cut-off we'd impose? So my tool transitioned from a user-facing tool to an internal admin tool.

Dashboard of Internal Migration Tool

Built using Laravel (PHP) and VueJS, our internal migration tool had a web interface that allowed you to start various tasks based on AD Users or Groups.

User and Group Search

It made use of a queue system to schedule work and multiple background workers to run through all the waiting jobs. The front-end allowed you to see the output of each task in near real-time and track everything that was going on. Via a combination of API calls and PowerShell scripts, I put together tasks which covered all the bits missed by the Exchange migration and even fixed some issues caused by it, like what we'd called "Collateral Calendars". Just to add some sugar to the tool, I added email notification tasks so we could let end users know, via branded emails, that they had been migrated. I also built a job to set the user's OWA time zone and language, as the time zone can cause chaos for calendars if the user accepts the default of UTC-12 on first login! After a couple of months of development (and non-stop tweaking since!), we had a solution which tied everything together and gave the end user the experience they deserved.
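The queue-and-workers pattern described above can be sketched in a few lines. This is only an illustration in Python, not the real tool (which was built on Laravel and PHP); the function name `run_jobs`, the job names and the log format are all made up for the example.

```python
import queue
import threading


def run_jobs(jobs, worker_count=4):
    """Run (name, callable) jobs on worker threads and return one log line per job.

    Hypothetical sketch: the real tool used a Laravel queue, not Python threads.
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    log = []
    log_lock = threading.Lock()

    def worker():
        while True:
            try:
                name, func = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do, so this worker stops
            try:
                outcome = f"{name}: ok ({func()})"
            except Exception as exc:
                # A failing job is logged but does not stop the other jobs.
                outcome = f"{name}: failed ({exc})"
            with log_lock:
                log.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log
```

Each worker pulls the next waiting job until the queue is empty, and every job leaves a log line whether it succeeds or fails, which is roughly what lets a front-end show per-task output as it happens.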

New Migration Wizard
Migration Overview
Migration Tasks
Migration Users
Migration Jobs

After months of comms and hosting information sessions for our staff, in June 2020, we officially started migrating our staff mailboxes. The decision was made to do the migrations by faculty or directorate rather than as one big bang. We did these overnight to minimise disruption, moving anywhere up to 1,500 mailboxes at a time. The migrations went fairly smoothly apart from a few snags with concurrency of tasks and hitting API rate limits. By the end of summer 2020, we had migrated all of our staff mailboxes and set our sights on moving our students.

The student migration would take place in the summer break of 2021. This would be a lot simpler as these would be done on a rolling basis during the day. The problem we faced was where to start! How do we batch our students? By course? By date? By name? How big would the batches be? We wanted to minimise the amount of work we had to do, while giving the student a good experience with the migration. We had settled on 1,800 students per batch going alphabetically by surname. After some further discussion, and rather last minute, this was dropped to 600 per batch to minimise the time it took to complete each batch. This would allow us to run our internal tool a lot sooner than waiting for 1,800 to complete.
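To make the batching concrete, here is a minimal Python sketch of splitting students alphabetically by surname into fixed-size batches. The `Student` type, the `make_batches` name and the tie-break on email address are assumptions for illustration; the post doesn't say how the real batches were generated.

```python
from typing import NamedTuple


class Student(NamedTuple):
    # Hypothetical record; the real data came from the University's directory.
    surname: str
    email: str


def make_batches(students: list[Student], batch_size: int = 600) -> list[list[Student]]:
    """Sort students by surname (case-insensitively) and cut them into batches."""
    ordered = sorted(students, key=lambda s: (s.surname.casefold(), s.email))
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```

For roughly 18,000 students, a batch size of 1,800 gives 10 batches while 600 gives 30. More batches means more to keep track of, but each one completes sooner, so the post-migration tasks can start sooner too.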

We gave ourselves a 2-week window to get all of our students across. About a week before the official start date, we generated our batches and loaded them into the Exchange migration wizard. This started a mailbox sync for all of our students and got the majority of their data into Exchange. Once the initial full sync had taken place, a delta sync was performed automatically once every 24 hours to minimise the amount of work needed when we told Exchange to complete each batch.

By this point all of our students were technically on Exchange. We just hadn't given them a license to get into their new mailbox, and there was also the potential for a mailbox to be up to 24 hours out of date.

Due to the rolling nature of this migration, we had mentioned as part of the comms that each student would get an email saying when their mailbox had started to move and when it was complete. This meant adding a new stage to my internal tool to accommodate pre-migration tasks such as sending out an email.

On the big day we started by sending out the pre-migration email to the first few batches. We then set those batches to complete within the Exchange migration wizard. Once those batches had completed, we kicked off the post-migration tasks within the internal tool, followed by the completion emails.

One of the biggest time sinks with this was validating and dispatching all the required jobs into the queues. Unlike the rest of the tasks, the validation and dispatch task couldn't handle concurrency. Because the process took so long, running it concurrently caused errors and migrations to be run multiple times. This meant it would take longer to generate the jobs than it took to complete them! At the end of day one, we had only completed a couple of batches. We'd spent most of the day waiting for my tool to do its thing. We needed to strike a balance between the number of batches completing and the throughput my tool could cope with. We didn't want batches sitting there completed but waiting ages to be run through my tool. The throughput was limited by how quickly the tool could dispatch the jobs and by the Microsoft Graph API rate limits.

Without realising it, I'd already given myself an answer to this problem. I'd initially added a scheduler to the system but never used it. This allowed tasks to be scheduled to run after a configurable time and date. Even though the tasks themselves would only run at that time, the validation and dispatch happened straight away. Making use of this, I could line up all of the remaining batches, for all tasks and stages, for some date in the future. This validated the migration and queued up all the jobs ready to be run. When each batch completed in the Exchange migration wizard, I could go into the database and change the start timestamp for all jobs associated with that batch and they would start ticking through instantly, with no waiting around for the tasks to be queued up!
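The trick of parking jobs in the future and then releasing them can be sketched like this. It is a rough Python and SQLite illustration under stated assumptions: the `jobs` table, its `available_at` column and the function names are hypothetical, and the real tool's schema isn't shown in this post.

```python
import sqlite3
import time

# Placeholder "some date in the future": 2100-01-01 00:00:00 UTC as a Unix timestamp.
FAR_FUTURE = 4102444800


def open_queue() -> sqlite3.Connection:
    """Create an in-memory stand-in for the tool's job table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, batch TEXT NOT NULL, "
        "task TEXT NOT NULL, available_at INTEGER NOT NULL)"
    )
    return db


def dispatch_batch(db, batch, users, tasks, available_at=FAR_FUTURE):
    """Do the slow part up front: queue every job for a batch, parked in the future."""
    rows = [(batch, f"{task}:{user}", available_at) for user in users for task in tasks]
    db.executemany("INSERT INTO jobs (batch, task, available_at) VALUES (?, ?, ?)", rows)
    db.commit()
    return len(rows)


def release_batch(db, batch, now=None):
    """Pull a batch's start time forward so the workers pick it up straight away."""
    now = int(time.time()) if now is None else now
    cur = db.execute("UPDATE jobs SET available_at = ? WHERE batch = ?", (now, batch))
    db.commit()
    return cur.rowcount


def runnable_jobs(db, now=None):
    """Return the tasks a worker would be allowed to start at the given time."""
    now = int(time.time()) if now is None else now
    query = "SELECT task FROM jobs WHERE available_at <= ? ORDER BY id"
    return [row[0] for row in db.execute(query, (now,))]
```

The slow validation and dispatch happens once, ahead of time, in `dispatch_batch`. Releasing a batch is then a single `UPDATE`, which is why the jobs could start ticking through the moment a batch completed in Exchange.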

After this useful change to the workflow, we flew through migration batches, sometimes completing 6 batches a day. We were done in just shy of a week. Within 5 days we had migrated around 18,000 mailboxes from Gmail to Exchange Online, a huge feat!

After an almost 2-year journey and hundreds of hours of work (many of them in the early hours of the morning), the migration of over 24,000 mailboxes was complete. My migration tool had completed a staggering 322,000 individual migration tasks and generated over 6.8 million lines of logs!

If you are about to take this same journey, be warned, there be gremlins down that path! I would recommend you stick to the IMAP migration method and forget all of the added extras like calendars, as they were painful to get right. Better yet, use a third-party migration service, as it should bypass the brick walls we hit doing this ourselves and save you from having to write your own tool.