Voice technologies are becoming foundational to digital inclusion in India, shaping how millions access public services, information, and the digital economy. Against this backdrop, a new Policy Report on voice technologies was launched at the India AI Summit Expo 2026 on February 20, 2026, setting out a policy framework to support open, inclusive, and responsible voice technologies.

Developers of the Report

The Policy Report was jointly developed by ARTPARK @IISc, Digital Futures Lab and Trilegal, with support from the Digital India BHASHINI Division and the FAIR Forward - AI for All initiative, implemented by GIZ (German Development Cooperation) and funded by the German Federal Ministry for Economic Cooperation and Development (BMZ). The collaboration brings together research, technical expertise, and ecosystem partnerships to advance responsible speech technologies in India.

India’s remarkable linguistic diversity presents both opportunities and challenges for the development of inclusive voice technologies, shaping how millions can participate in the digital economy. The report examines key barriers to building open and responsible speech systems in India, from data collection and model development to infrastructure and responsible practices. It proposes policy recommendations and governance mechanisms to support an innovative and equitable voice-technology ecosystem.

Key Recommendations

Treating Foundational Datasets as Public Goods: Foundational datasets for speech technologies are large, reusable corpora of audio, text, and metadata. They are curated to support a wide array of downstream applications, including automatic speech recognition (ASR), text-to-speech (TTS), and speech translation. Making foundational datasets available as digital public goods addresses market failures in voice-technology ecosystems and promotes local economic innovation.
The report identifies challenges arising from the linguistic diversity and nuance of Indian languages, infrastructural barriers, the absence of common data standards (which creates governance gaps), and unresolved intellectual property and privacy issues. Policy recommendations for building foundational language datasets include clarifying and revisiting existing laws to enable the use of publicly available material, ensuring sustainable investments supported by government and blended finance techniques, and instituting strong governance systems with shared standards, coordinated repositories, and independent quality assurance.

Building Open and Representative Models: India’s success in inclusive voice technologies depends on whether speech systems perform well across the country’s linguistic and social diversity. Today, limited openly available datasets, inadequate evaluation benchmarks, and uneven access to compute have resulted in models that perform inconsistently across languages, accents, and demographic groups, especially those from rural, low-resource, or marginalised communities. Strengthening India’s ecosystem requires further investment in data, evaluation, and infrastructure. Benchmarking should be built around open evaluation datasets and transparent leaderboards that provide common baselines for developers, improve procurement standards, and help assess performance across diverse languages and speaker groups. On the infrastructure side, platforms such as Bhashini, ULCA, and AI Kosh offer strong foundations, although their effective operation hinges on sustained governance, clear access protocols, and long-term funding models.
Institutionalising Sustainable Open-Source Infrastructure: Speech datasets place far greater demands on storage, bandwidth, and compute than text data, making the financing and governance of long-term hosting a central challenge for India’s voice-tech ecosystem. A fragmented licensing landscape adds further uncertainty: overlapping or incompatible terms across data, code, model weights, and evaluation sets impose substantial compliance burdens on small actors, while enforcement gaps allow misuse with little recourse. Sustaining open, equitable development requires treating dataset hosting as durable public digital infrastructure rather than as grant-based, project-specific assets. By international standards, India’s emerging platforms, such as AI Kosh, provide a remarkable foundation. However, they require long-term funding, transparent governance, and clear access pathways for non-government contributors. Collaborative stewardship models, such as the Mozilla Data Collective, can help establish shared quality norms and consistent metadata conventions.

Strengthening Responsible Deployment: Deploying speech technologies responsibly requires more than high-performance models; it depends on safe systems, contextually appropriate use, and clear accountability. Existing data practices lack value-sharing mechanisms, leaving communities and researchers without recognition or benefit-sharing, even as their contributions fuel commercial products. Risks of misuse, including voice cloning, phishing, and deepfake-driven misinformation, are rising, and unintended harms such as linguistic exclusion, biased performance across accents and genders, and the erosion of regional language identities remain widespread.
Addressing these gaps requires structural guardrails that embed fairness, transparency, and accountability into deployment workflows. Value-sharing mechanisms can help counter extractive data practices through attribution norms, community benefit-sharing, and the use of copyleft licences for publicly funded datasets. Preventing misuse demands a combination of technical safeguards, stronger legal pathways, and widespread public literacy efforts to help users recognise risks and exercise their rights.

The report argues that a strong ecosystem requires more than innovation funding. Building open-source foundations, including language datasets, standards, collection protocols, and responsible AI frameworks, promotes demand-driven local innovation. It is therefore essential that the state plays an active, shaping role, much as it has in the development of digital public infrastructure. In the context of voice technology, this involves both investing in commercially viable languages and sustaining low-resource languages that are vital for inclusion but unlikely to attract private capital.

Open-source assets can reduce costs for the public and private sectors alike. However, they demand long-term planning and financing for hosting, maintenance, and updates. These assets can be supported through blended-finance models that pool public, philanthropic, and commercial resources. Emerging national initiatives, such as the proposed AI marketplaces, can further structure participation, transparency, and value-sharing across data, annotation, and deployable models.