Indian Gen AI Investment report

Foundational Models

Before foundation models, AI was task-specific and required extensive manual engineering for each application. Foundation models changed this by being versatile and multimodal, handling tasks like text, image, or audio generation. They can be standalone systems or serve as a base for other applications.

Foundation models enable:

Versatility: They can perform a wide range of tasks, such as text translation, summarization, report generation, email drafting, and content creation.

Efficiency: They can apply learned patterns from one task to another, reducing the need for separate models for each task.

Multimodality: They can handle various input formats, including text, images, or videos, making them suitable for a wide range of applications.

Broad applicability: They can be used in various industries, from customer service chatbots to content creation tools, and in research fields like natural language processing and computer vision.

Foundation models are powerful and versatile, but they also require careful consideration regarding their potential impacts and risks, such as bias, disinformation, and power concentration.

Types of Foundational Models

| Type | Output | Purpose | Examples |
|---|---|---|---|
| Linguistics | Text | Understand and generate textual patterns | Language translation |
| Vision | Images and videos | Understand and generate visual patterns | Object recognition |
| Audition | Audio | Understand and generate auditory patterns | Speech dubbing |
| Robotics | Multimodal | Software controlling hardware | Smart homes |
| Reasoning | Tables and vectors | Understand patterns in qualitative and quantitative data | Diagnostics, planning and forecasting |

For the scope of this report, we limit the discussion to foundational models in the context of India-specific use cases, as the overall field is very broad and heavily contested.

Understanding Foundational Models

Factor: Underlying technology

Description:
Massive neural networks trained on uncategorised datasets, which find patterns in the data by assigning weights to each unit.
Very iterative; lots of experimentation required.
Requires a lot of time and large volumes of manual intervention to increase the accuracy of the output.

Thesis:
Long development time frames.
Highly proprietary tech.
Very difficult to build.

Factor: Engineering effort

Description:
Highly skilled engineering talent required.
Very niche field; small pool of quality talent.
Globally, far more lucrative opportunities are available than in the Indian startup ecosystem.

Thesis:
Very high engineering effort.
Absence of strong talent.

Factor: Data availability

Description:
Large data sets are required. They need to be diverse to make learning efficient and output accurate, but they can be unstructured and unlabelled.
Most India-specific data sets are not digitised, so low data availability poses a problem in training these foundational models for the Indian context.
Even the data sets that are available are in popular languages like Hindi, Marathi, and Bengali. Very limited data exists for other, more regional languages.

Thesis:
Data availability in the Indian context is an issue.
Low-accuracy products expected.

Factor: Business use cases

Description (B2C):
Low digital distribution effort.
Examples of fastest growing users.
Mostly works on hype and momentum; used for experimentation purposes.
High churn after the initial boom; users turn to specialised AI apps for specific purposes.
Adoption mostly driven by direct channels via UGC on social media platforms.
Expensive for individuals; if accuracy is low, very low value for money.

Description (B2B):
High distribution effort.
High costs of integration into the workflow at the fundamental level.
The primary business model is to drive an ecosystem of apps built on top of these foundational models.
These apps are highly fine-tuned for a specific use case and pass on a percentage of their revenue to the foundational model, similar to the App Store or Play Store.

Thesis:
B2B2C business model.
Ecosystem of apps is necessary.

Factor: Capital requirement

Description:
Very high capital infusion required for R&D (in the tens or hundreds of millions).
Long ROI timelines, with very high competition from existing players.
Huge potential returns with a very high risk of failure.

Thesis:
Very high.
Investment size out of scope.

Conclusion of Investment thesis on Foundational Models

Drivers & Challenges

A vast amount of data is needed to train these models effectively, and it is not available for many Indian languages and contexts.

Furthermore, this data needs to be available in digital format; even the data that does exist needs to be digitised, which is a huge task.

According to the 2011 India Census, only 11% of the total population speaks English, which leaves a huge market untapped by existing Western models like ChatGPT and Llama.

Western models do offer translation, but it simply translates knowledge from a Western context into the desired language. Such translation often fails to capture the essence of a conversation, rendering it useless for commercial applications like handling customer queries in regional languages.

Such translation also requires extra compute power, making commercial use even more expensive.

An LLM trained natively on regional languages and with Indian context could bring AI into the mainstream in India, reaching the untapped 89% of the population.

For auditory LLMs, there is again the problem of obtaining a vast amount of diverse data in regional languages.

Auditory LLMs trained mostly on Western data have an inherent bias with respect to Indian accents, pronunciation, and vocabulary, which makes them less useful for commercial use cases.

For computer vision LLMs, the training data should not make much difference. Any biases that emerge in Western LLMs can be fine-tuned away without an additional effect on costs.

However, a difference can arise at the prompt input: since input is only available in English, these models might not be as useful for the B2C segment.

Additionally, there is the question of data security: with AI taking in more and more personal information and actively tracking it to drive conversations and recommendations, how safe is it for such data to sit on foreign servers over which we have no control?

Developing LLMs is not an easy task; it requires great engineering innovation and effort, which takes a lot of time and investment. The typical starting capital requirement for an LLM makes it difficult for early-stage VCs to fund such startups. Furthermore, these new startups would be semi-competitors of well-established, proven LLMs like ChatGPT and Llama.

Early-stage investment in a foundational LLM does not make sense unless the following criteria are met:

Team with exceptional track record.

Strong leadership capability to attract best talent pool.

A technological innovation that brings down development & training time significantly.

Strong interest from the entrepreneur community in participating in such an ecosystem.

Large fund size.