AI4Bhārat Launch at IIT Madras 28th July 2022

We are launching!

The Nilekani Center at AI4Bharat is launching on 28th July 2022 at IIT Madras. Co-located with the launch will be the first AI4Bharat Workshop on Indian Language AI.

AI4Bharat

Our Mission

Bring Indian languages to parity with English in AI technologies through open-source contributions in datasets, models, and applications, and by enabling an innovation ecosystem.

Our Impact Axes

Data

Curate and create the largest public datasets and benchmarks across various tasks and 22 Indian languages.

AI Models

Build state-of-the-art, open, foundational AI models across tasks and 22 Indian languages.

Applications

Design and deploy reference applications with partners to demonstrate the potential of open AI models.

Ecosystem

Enable researchers, startups, and government to innovate on Indian language AI technology through educational material and workshops.

Areas

Translation

Open-source datasets (Samanantar) and models (IndicTrans) for neural machine translation between English and 12 Indic languages.

Transliteration

Open-source datasets and benchmarks (Aksharantar), models (IndicXlit), and applications for transliteration between Roman and scripts for 20+ Indic languages.

Speech Recognition

Open-source models (IndicWav2Vec) for speech recognition in 9 Indian languages.

Language Understanding

Open-source language models (IndicBERT), benchmarks (IndicGLUE), and entity recognizers (IndicNER) for 10 Indian languages.

Language Generation

Open-source language generation model (IndicBART) and benchmarks (IndicNLG Suite) for 10 Indian languages.

Sign Language

Open-source datasets (INCLUDE, SignCorpus) and models (OpenHands) for sign recognition in 10 sign languages from around the world.

Coming Soon
Shoonya

Open-source workbench for AI-assisted language work on Indian languages, with an initial focus on translation.

Coming Soon
Chitralekha

Open-source tool for AI-assisted video subtitling and translation, with a focus on educational and media content.

Our Sponsors

Nandan Nilekani

As the primary sponsor, Nandan Nilekani has generously contributed to the formation of the AI4Bharat center, with a focus on open-source language technology as a public good. The team at EkStep Foundation also collaborates closely with the center and mentors it.

MeitY, Govt of India

AI4Bharat is the official Data Management Unit (DMU) of the Digital India Bhashini project. As the DMU, AI4Bharat is collecting datasets across India’s 22 scheduled languages.

Microsoft

Microsoft’s Research Lab and India Development Center (IDC) have supported AI4Bharat with unrestricted research grants and time for researchers to contribute towards open-source technologies.

CDAC

CDAC has provided generous access to supercomputing resources for training large AI models and hosting large amounts of data.

Our Team

Vivek Raghavan

Chief mentor and evangelist, EkStep Foundation

Mitesh Khapra

Associate Professor at CSE Department, IIT Madras

Pratyush Kumar

Researcher at Microsoft Research and Adjunct Faculty at IIT Madras

Anoop Kunchukuttan

Researcher at Microsoft

Positions

AI4Bharat Residency

1 year
Remote

1-year program for recent graduates to work on cutting-edge research problems in NLP, speech, and systems engineering for AI.

AI4Bharat Internship

1 semester
Remote

1-semester program for current students to work on data and software engineering for language AI technologies.

Language Contributor

Long-term
Remote

Long-term opportunities for experienced language translators and transcribers across Indian languages.

Software Developer

Long-term
In-person

Long-term opportunities for front-end and back-end developers who want to build open-source applications for language technologies.

We must be second to none in the application of advanced technologies to the real problems of man and society.
- Vikram Sarabhai