Federated learning for LLMs

Federated training for GenAI

Randy D
Amazon Employee
Published May 9, 2024
About 3 years ago I wrote a blog post about federated learning on AWS using the Flower framework. Recently I decided to update the sample code to work with a large language model (LLM) for generative AI.
Why is federated learning interesting for LLMs? The drivers are largely the same as for smaller models: the need to fine-tune using data from devices or sites with limited connectivity, and the need to keep data local.
In the time since I wrote the original blog, federated learning frameworks have made a lot of progress. I still like the Flower framework for the simplicity of its API. It's easy to put a proxy in front of the devices so that they communicate asynchronously over MQTT. Flower also integrates with Hugging Face, which makes it straightforward to work with LLMs.
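To make the Hugging Face integration concrete, here is a minimal client sketch in the style of Flower's Hugging Face quickstart (Flower 1.x API). The model name, server address, and placeholder example count are illustrative, not taken from the actual sample:

# Minimal sketch of a Flower client that fine-tunes a Hugging Face model.
# The model name, server address, and placeholder values below are
# illustrative; the real sample has its own training and eval loops.
from collections import OrderedDict

import flwr as fl
import torch
from transformers import AutoModelForSequenceClassification

MODEL_NAME = "distilbert-base-uncased"  # placeholder; swap in your model
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

NUM_LOCAL_EXAMPLES = 16  # placeholder: size of this client's local dataset


class HFClient(fl.client.NumPyClient):
    def get_parameters(self, config):
        # Serialize the model weights as NumPy arrays for the server
        return [val.cpu().numpy() for val in model.state_dict().values()]

    def set_parameters(self, parameters):
        # Load the globally averaged weights back into the local model
        keys = model.state_dict().keys()
        state_dict = OrderedDict(
            (k, torch.tensor(v)) for k, v in zip(keys, parameters)
        )
        model.load_state_dict(state_dict, strict=True)

    def fit(self, parameters, config):
        self.set_parameters(parameters)
        # A real client would run local fine-tuning on its own data here
        return self.get_parameters(config), NUM_LOCAL_EXAMPLES, {}

    def evaluate(self, parameters, config):
        self.set_parameters(parameters)
        # A real client would evaluate on a local test split here
        return 0.0, NUM_LOCAL_EXAMPLES, {}


fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=HFClient())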
Compared to the original code, I made these changes:
  • I ran the Greengrass Lambda function without container isolation. The function was hitting memory limits inside a container, so it was simpler to drop that isolation (a boto3 sketch of this setting follows the list). Of course, updating to Greengrass v2 would be a better long-term solution.
  • I used a larger EBS volume on the simulated Greengrass core devices.
  • I updated to Python 3.8, since Greengrass v1 is limited to Python 3.8 or earlier.
  • I used the averaging method from the Flower sample (see the server-side sketch after this list).
  • I set the memory limits higher on the proxy containers.
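For the no-container change, Greengrass v1 exposes an IsolationMode setting on the Lambda function definition. Here's a rough boto3 sketch; the definition name and function ARN are placeholders:

# Sketch of turning off container isolation for a Greengrass v1 Lambda
# function via boto3. Names and the ARN are placeholders.
import boto3

greengrass = boto3.client("greengrass")

greengrass.create_function_definition(
    Name="flower-client-functions",  # placeholder name
    InitialVersion={
        "Functions": [
            {
                "Id": "flower-client",
                "FunctionArn": "arn:aws:lambda:REGION:ACCOUNT:function:flower-client:1",
                "FunctionConfiguration": {
                    "Pinned": True,  # long-lived function
                    "Timeout": 300,
                    "Environment": {
                        # Run directly on the host instead of inside the
                        # Greengrass container, avoiding its memory limits
                        "Execution": {"IsolationMode": "NoContainer"}
                    },
                },
            }
        ]
    },
)

And the averaging itself happens on the server. Assuming the standard FedAvg strategy that the Flower samples use, the server side looks roughly like this (client counts and round count are illustrative):

# Sketch of the Flower server using the built-in FedAvg strategy.
# The client counts and number of rounds are illustrative values.
import flwr as fl

strategy = fl.server.strategy.FedAvg(
    fraction_fit=1.0,         # ask every available client to train each round
    min_fit_clients=2,        # wait for at least two clients before a round
    min_available_clients=2,  # don't start until two clients have connected
)

fl.server.start_server(
    server_address="0.0.0.0:8080",
    config=fl.server.ServerConfig(num_rounds=3),
    strategy=strategy,
)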
The new code is in a branch in the original repo. I've tested it lightly, just enough to prove the concept.

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
