Learning
The model can generalize based on data from a group of devices and thus instantly work with other compatible devices.
Data can explain all variations in the devices and their environment.
Connectivity - data must be transmitted over a stable connection.
Bandwidth - e.g. a new electrical substation could generate 5 GB/s.
Latency - real-time applications, e.g. automation, require very low latency.
Privacy - sensitive operational data must remain on site.
ML that runs on site, onboard each connected device: by continuously training the ML model on streaming data, each device learns an individual model for its environment.
Each model only needs to be able to explain what is normal for itself and not how it varies compared to all other devices.
Models adapt to changes over time, learning is not constrained by the internet connection, and no confidential information needs to be transferred to the cloud.
A drawback is that it is not possible to get an overall view or shared learning across devices.
Federated learning (FL): an ML technique to train algorithms across decentralized edge devices while keeping the data samples stored locally.
Google has been the main early player.
Aim to train ML models on billions of mobile phones while respecting the privacy of the users.
Only fractions of the training results are sent to the cloud:
E.g. training derivatives (gradients).
The raw training data itself never leaves the device.
When collected in the cloud, the partial training results can be assembled into a new supermodel that, in the next step, can be sent back to the devices.
Google's open-source framework: TensorFlow Federated.
Model inspection - evaluation of device behavior through its model.
Model comparison - comparing models in the cloud, e.g. against each other or against the super-model, to find outliers (see the sketch after this list).
Robust learning - learning can continue even if the connection to the cloud is lost.
Tailored initialization - new devices can start with a model from a similar device, instead of a general super-model.
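As a toy illustration of model comparison, here is a minimal sketch (all names and the threshold are assumptions, not taken from the source) that flags client models whose weights sit far from the average model:

```python
# Illustrative sketch: flag client models whose weights deviate strongly
# from the mean model; such outliers may indicate unusual devices or data.
import numpy as np

def find_outlier_models(client_weights, threshold=2.0):
    """client_weights: dict mapping client id -> flattened weight vector."""
    ids = list(client_weights)
    stacked = np.stack([client_weights[cid] for cid in ids])
    mean_model = stacked.mean(axis=0)
    distances = np.linalg.norm(stacked - mean_model, axis=1)
    # Flag clients more than `threshold` standard deviations above the mean distance.
    cutoff = distances.mean() + threshold * distances.std()
    return [cid for cid, d in zip(ids, distances) if d > cutoff]
```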
FL employs an iterative method consisting of multiple client-server exchanges; each exchange is called a federated learning round.
Distribute the current/updated global model state to the contributing nodes (participants).
Train the model locally on each node to produce candidate model updates.
Process and aggregate the local updates into a single global update so that the central model can be updated accordingly.
The FL server performs this processing and aggregation of local updates into the global update.
Local training is performed by the local nodes according to the commands of the FL server.
The FL approach allows for the mass processing of data in a distributed way.
It can follow a client-server architecture.
The server sends the model to be trained to the clients (step 1).
The results of the local computation are sent back to the server, which aggregates them into the global model (step 2).
The server returns the new aggregated model to the clients (step 3).
This iteration, named a federated learning round (FLR), repeats until some stopping criterion is reached, such as model convergence or the maximum number of iterations.
Edge devices only send model information (parameters, weights, and, before training, hyperparameters), never the raw data.
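The round structure above can be made concrete with a minimal sketch (a simplified assumption using a linear model and NumPy weight vectors, not an implementation from the source):

```python
# One federated learning round: broadcast (step 1), local training (step 2),
# weighted aggregation of the local updates into a new global model (step 3).
import numpy as np

def local_update(global_weights, X, y, lr=0.01, epochs=1):
    """Client side: start from the global model and run a few epochs of
    gradient descent on the local data (linear model, squared-error loss)."""
    w = global_weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w, len(y)

def federated_round(global_weights, clients):
    """Server side: collect local updates and average them, weighting each
    client by the size of its local dataset."""
    updates = [local_update(global_weights, X, y) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum((n / total) * w for w, n in updates)
```

Repeating `federated_round` until the model converges or a maximum number of rounds is reached corresponds to the stopping criterion described above.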
Federated learning distributes deep learning by eliminating the need to pool the data into a single place.
In FL, the model is trained at different sites in numerous iterations.
Effective aggregation of distributed models across devices is essential for creating a generalized global model.
Its efficiency affects precision, convergence time, number of rounds, and network overhead.
Federated stochastic gradient descent (FedSGD): each client performs a single step of local training on its data per round of communication. FedSGD requires a substantial number of training rounds to produce reliable models. This algorithm is the baseline of federated learning.
The FedAvg algorithm starts from FedSGD, but each client trains locally on its own data, starting from the current model and performing multiple steps of SGD, before sending the model back to the server for aggregation (the aggregation step is written out below).
FedAvg reduces the communication overhead required to upload and download the FL model.
It requires clients to perform more total computation during training.
Local epoch: one complete pass of the training dataset through the algorithm.
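A common way to write the FedAvg aggregation step (following McMahan et al.; the notation is an addition, not from the source), where client $k$ holds $n_k$ of the $n$ total samples and $w_{t+1}^{k}$ is the result of $E$ local epochs of SGD started from the current global model $w_t$:

$$ w_{t+1} \leftarrow \sum_{k=1}^{K} \frac{n_k}{n}\, w_{t+1}^{k}, \qquad n = \sum_{k=1}^{K} n_k $$

With $E = 1$ and a single gradient step per client, this reduces to federated SGD.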
Fault-Tolerant Federated Averaging: fault tolerance is the ability of a computing system to continue working in the event of a failure.
Can tolerate some nodes being offline during secure aggregation.
q-Federated Averaging (q-FedAvg): reweights the objective in order to achieve fairness in the global model.
Gives higher weights to devices with poor performance.
The network's accuracy distribution becomes more uniform.
Each device's local loss F_k is raised to the power (q+1); q is a parameter that tunes the amount of fairness to impose.
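Assuming the notes refer to the q-FFL formulation (Li et al.), the reweighted objective can be written as, with $F_k$ the local loss of device $k$ and $p_k$ its weight:

$$ \min_w \; f_q(w) = \sum_{k=1}^{m} \frac{p_k}{q+1}\, F_k(w)^{\,q+1} $$

Larger $q$ puts more emphasis on devices with high loss, making the accuracy distribution more uniform; $q = 0$ recovers the standard objective.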
Federated Optimization (FedOpt): uses a client optimizer during the multiple local training epochs and a server optimizer during model aggregation.
Example server optimizers: ADAGRAD, ADAM, and Yogi.
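A minimal sketch of the server-optimizer idea (an assumption following the FedOpt line of work, not an implementation from the source): the averaged client delta is treated as a pseudo-gradient and fed to a server-side Adam.

```python
# Illustrative FedOpt-style round: clients run their own optimizer locally,
# the server applies Adam to the averaged client delta (pseudo-gradient).
import numpy as np

class ServerAdam:
    def __init__(self, dim, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = np.zeros(dim)  # first-moment estimate
        self.v = np.zeros(dim)  # second-moment estimate

    def step(self, weights, pseudo_grad):
        self.m = self.b1 * self.m + (1 - self.b1) * pseudo_grad
        self.v = self.b2 * self.v + (1 - self.b2) * pseudo_grad**2
        return weights + self.lr * self.m / (np.sqrt(self.v) + self.eps)

def fedopt_round(server_opt, global_w, client_weights):
    # Pseudo-gradient: average difference between client results and the global model.
    delta = np.mean([w - global_w for w in client_weights], axis=0)
    return server_opt.step(global_w, delta)
```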
TensorFlow Federated (TFF): open-source framework for experimenting with machine learning and other computations on decentralized data.
It puts the ability to locally simulate decentralized computations into the hands of all TensorFlow users.
Take an ML model architecture of our choice.
Train it with federated learning across data provided by all users.
The version of the NIST dataset that has been processed by the Leaf project separates the digits written by each volunteer.
Training an ML model with federated learning is one example of federated computation.
Evaluating it over decentralized data is another.
An array of sensors capturing temperature readings.
Compute the average temperature across these sensors.
Each client computes its local contribution.
A centralized coordinator aggregates all the contributions.
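In TFF this sensor example can be expressed roughly as follows (adapted from the TFF tutorials; exact API names can vary between TFF versions):

```python
import tensorflow as tf
import tensorflow_federated as tff

# A federated computation: the input is a float placed on the clients,
# the output is its federated mean computed on the server.
@tff.federated_computation(tff.type_at_clients(tf.float32))
def get_average_temperature(sensor_readings):
    return tff.federated_mean(sensor_readings)

# In simulation, the "clients" are just a Python list of local readings.
print(get_average_temperature([68.5, 70.3, 69.8]))
```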
Open-source framework for experimenting with machine learning and other computations on decentralized data.
It can be used in containers within a federated framework, FedFramework.
Operation | Endpoint |
---|---|
List containers | http://10.0.22.37:8000/containers/list |
Server creation | http://10.0.22.37:8000/containers/create/server?img_server=fed-server&port=5010&id=10&clients=4&algorithm=FedAvg&model=cnn&rounds=10&epochs=5&predict=true |
Rapid deployment of a testbed and running tests | http://10.0.22.37:8000/run?img_server=fed-server&img_client=fed-client&model=cnn&clients=4&rounds=10&epochs=5&predict=true&predict_client=true |
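Assuming these endpoints accept plain HTTP GET requests (the exact request semantics are not documented here), a call could look like:

```python
import requests

BASE = "http://10.0.22.37:8000"

# List the containers currently managed by FedFramework.
print(requests.get(f"{BASE}/containers/list").text)

# Rapidly deploy a testbed: 4 clients, CNN model, 10 rounds of 5 local epochs.
params = {
    "img_server": "fed-server", "img_client": "fed-client",
    "model": "cnn", "clients": 4, "rounds": 10, "epochs": 5,
    "predict": "true", "predict_client": "true",
}
print(requests.get(f"{BASE}/run", params=params).text)
```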
Domain | Applications |
---|---|
Edge computing | FL is implemented in edge systems using the MEC (mobile edge computing) and DRL (deep reinforcement learning) frameworks for anomaly and intrusion detection. |
Recommender systems | Federated collaborative filtering methods learn the rating matrix using a stochastic gradient approach and secure matrix factorization with federated SGD. |
NLP | FL is applied to next-word prediction in mobile keyboards by adopting the FedAvg algorithm to learn a CIFG (coupled input-forget gate) language model. |
IoT | FL could be one way to handle data privacy concerns while still providing a reliable learning model. |
Mobile service | The predicting services are based on the training data coming from edge devices of the users, such as mobile devices. |
Biomedical | The volume of biomedical data is continually increasing. However, due to privacy and regulatory considerations, the capacity to evaluate these data is limited. By collectively building a global model for the prediction of brain age, the FL paradigm in the neuroimaging domain works effectively. |
Healthcare | Owkin and Intel are researching how FL could be leveraged to protect patients' data privacy while also using the data for better diagnosis. |
Autonomous industry | Another important reason to use FL is that it can potentially minimize latency. Federated learning may enable autonomous vehicles to behave more quickly and correctly, minimizing accidents and increasing safety. Furthermore, it can be used to predict traffic flow. |
Banking finance | FL is applied in open banking and in finance for anti-financial-crime processes, loan risk prediction, and the detection of financial crimes. |
Edge vehicles compute the model locally; after completing each local training epoch, they retrieve the global model version and compare it to their local version.
To form a global view of all local models, the central server performs aggregation based on a ratio determined by the global and local model versions.
The aggregation server returns the aggregated result to the edge vehicles that request the most recent model.
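One way to read the version-based ratio (an illustrative assumption, not the exact scheme from the source): the server mixes each vehicle's model into the global model with a weight that shrinks as the local version falls behind the global one.

```python
# Illustrative sketch of version-aware aggregation for edge vehicles.
import numpy as np

def aggregate_by_version(global_w, global_version, local_updates):
    """local_updates: list of (local_weights, local_version) from edge vehicles."""
    new_w = global_w.copy()
    for local_w, local_version in local_updates:
        # Staler local versions (larger version gap) get a smaller mixing ratio.
        ratio = 1.0 / (1.0 + (global_version - local_version))
        new_w = (1 - ratio) * new_w + ratio * local_w
    return new_w
```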
MEC-empowered model sharing.
FL brings edge intelligence to wireless edge networks and enhances the connected intelligence among end devices in 6G networks.