NVIDIA Shows How To Build AI Models At Scale With PyTorch Lightning

In their latest blog post, NVIDIA researchers showed how to build speech models with PyTorch Lightning on CPU-powered AWS instances (Grid). PyTorch Lightning is a lightweight PyTorch wrapper designed to make high-performance AI research simple. It organises PyTorch code so that users can train their models on CPUs, GPUs, or multiple nodes without changing the model code.
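As a rough illustration of what "without changing any code" means in practice, the sketch below keeps the hardware choice in a small set of settings that would be handed to the Lightning `Trainer`, while the model itself stays untouched. The `trainer_kwargs` helper and its preset names are hypothetical; `accelerator`, `devices`, `num_nodes`, and `strategy` are `Trainer` arguments in recent PyTorch Lightning releases, and the device counts are placeholders rather than values from the NVIDIA post.

```python
from typing import Any, Dict


def trainer_kwargs(target: str) -> Dict[str, Any]:
    """Return illustrative Trainer settings for a named hardware target.

    The preset names and device counts are assumptions for this sketch.
    """
    presets: Dict[str, Dict[str, Any]] = {
        "cpu": {"accelerator": "cpu", "devices": 1},
        "single_gpu": {"accelerator": "gpu", "devices": 1},
        "multi_gpu": {"accelerator": "gpu", "devices": 4, "strategy": "ddp"},
        "multi_node": {
            "accelerator": "gpu",
            "devices": 4,
            "num_nodes": 2,
            "strategy": "ddp",
        },
    }
    if target not in presets:
        raise ValueError(f"unknown hardware target: {target!r}")
    # Return a copy so callers cannot mutate the presets.
    return dict(presets[target])


# With PyTorch Lightning installed, the same LightningModule is reused as-is:
#   import pytorch_lightning as pl
#   trainer = pl.Trainer(max_epochs=1, **trainer_kwargs("multi_gpu"))
#   trainer.fit(model, train_dataloader)
```

Only the dictionary changes between a laptop run and a multi-node run; the model and data code stay the same.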

Grid, which runs on AWS, supports Lightning and classic machine learning frameworks such as TensorFlow, Keras, PyTorch, scikit-learn, and others. It also helps users scale the training of models from the NGC catalogue. The NGC catalogue is a curated set of GPU-optimised containers for deep learning, visualisation, and high-performance computing (HPC).

The PyTorch Lightning software and developer environment is available on the NGC catalogue. Also, check out GitHub to get started with Grid, NGC, and PyTorch Lightning.

Training AI Models

To build speech models, the NVIDIA researchers used automatic speech recognition (ASR), which transcribes spoken language to text. ASR is a critical component of speech-to-text systems. When training ASR models, the goal is to generate text from a given audio input that minimises the word error rate (WER) on human-transcribed speech. The NGC catalogue contains state-of-the-art (SOTA) pretrained ASR models.
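To make the WER metric concrete, here is a minimal sketch of the usual definition: the word-level edit distance (substitutions, deletions, and insertions) between the reference and the hypothesis, divided by the number of reference words. The function name `word_error_rate` is chosen for illustration and is not taken from NeMo or the NVIDIA post.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, transcribing the reference "the cat sat" as "the cat sit" is one substitution over three reference words, so the WER is one third. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.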

Further, they use Grid sessions, NVIDIA NeMo, and PyTorch Lightning to fine-tune these models on the AN4 dataset, also known as the Alphanumeric dataset. Collected and published by Carnegie Mellon University, the AN4 dataset consists of recordings of people spelling out addresses, names, phone numbers, etc.

Here are the key steps to follow when building speech models:

Create a Grid session optimised for Lightning and pretrained NGC models
Clone the ASR demo repo and open the tutorial notebook
Install NeMo ASR dependencies
Convert and visualise the AN4 dataset (Spectrograms and Mel spectrograms)
Load a pretrained QuartzNet model from NGC and run inference
Fine-tune model with Lightning
Inference and deployment
Pause session

Create a Grid Session

Grid sessions run on the same hardware that users need to scale, and they come with pre-configured environments so that users can iterate on the ML process faster. Sessions are linked to GitHub, loaded with JupyterHub, and can be accessed through SSH or an IDE without installing anything yourself.

Check out the Grid Session tour (requires a Grid.ai account).

Clone ASR demo repo & open tutorial notebook

Once you have a developer environment optimised for PyTorch Lightning, the next step is to clone the NGC-Lightning-Grid-Workshop repo. After this, you can open the notebook to fine-tune the NGC-hosted model with NeMo and PyTorch Lightning.

Install NeMo ASR dependencies

Install the dependencies the session needs to run tools such as PyTorch Lightning and NeMo and to process the AN4 dataset. To do so, run the first cell in the tutorial notebook, which runs the bash commands that install the dependencies.

Convert and Visualise the AN4 Dataset

The AN4 dataset contains raw Sph (NIST SPHERE) audio files. Convert them to the Wav format so that you can use NeMo audio processing.
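One way to do the conversion is with the SoX command-line tool, which picks the input and output formats from the file extensions. The sketch below assumes the raw files carry a `.sph` extension and that `sox` is installed and on the PATH; the helper names `sox_command` and `convert_all` and the directory layout are hypothetical, and this is not necessarily how the tutorial notebook does it.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def sox_command(sph_path: Path, wav_dir: Path) -> List[str]:
    """Build the SoX call that converts one .sph file to a .wav file."""
    wav_path = wav_dir / (sph_path.stem + ".wav")
    return ["sox", str(sph_path), str(wav_path)]


def convert_all(sph_dir: Path, wav_dir: Path) -> List[Path]:
    """Convert every .sph file under sph_dir and return the .wav paths."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    wav_dir.mkdir(parents=True, exist_ok=True)
    converted: List[Path] = []
    for sph_path in sorted(sph_dir.rglob("*.sph")):
        cmd = sox_command(sph_path, wav_dir)
        # check=True raises CalledProcessError if SoX fails on a file.
        subprocess.run(cmd, check=True)
        converted.append(Path(cmd[-1]))
    return converted
```

Keeping the command construction separate from the call to `subprocess.run` makes it easy to print or inspect the commands before converting a whole dataset.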

Once the files are converted, you can visualise an audio example as an image of its waveform. The image below shows the activity in the waveform that corresponds to each letter in the audio. Each spoken letter has a different "shape." Interestingly, the last two blobs look relatively similar because they are both the letter N.

The audio waveform of the sample example (Source: NVIDIA)
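The "blobs" in a waveform plot are stretches where the signal is loud. A minimal way to see them without a plotting library is to compute a short-time RMS energy envelope, sketched below with the Python standard library only. The function name `rms_envelope`, the 20 ms frame size, and the restriction to 16-bit mono PCM are assumptions for this sketch; the tutorial notebook most likely uses dedicated audio and plotting libraries instead.

```python
import math
import struct
import wave
from typing import List


def rms_envelope(wav_path: str, frame_ms: float = 20.0) -> List[float]:
    """Return the RMS amplitude of consecutive frames of a 16-bit mono WAV."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    # Decode little-endian signed 16-bit samples.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    frame_len = max(1, int(rate * frame_ms / 1000))

    envelope: List[float] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        envelope.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return envelope
```

Frames with near-zero values correspond to the silence between letters, and runs of high values correspond to the spoken letters themselves.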

Source: https://analyticsindiamag.com/nvidia-shows-how-to-build-ai-models-at-scale-with-pytorch-lightning/


Published by Veille-cyber, 4 years ago
