Conyac DATA: A Service for Creating Machine Learning Data

Conyac DATA is a service for creating machine learning data with the aim of improving the quality of programs such as speech recognition software and chatbots.
Using the bilingual crowdsourcing platform Conyac, we promptly create and deliver high-volume data sets that were previously considered difficult to produce.

Contact us now

Conyac will create and sell data for AI use according to your needs!


Speech data creation

We record voice data from bilingual speakers around the world and create multilingual speech data on demand, regardless of nationality, age, gender, or language.


Chatbot data

We will create data from scratch for building chat systems designed to take inquiries and provide customer support.


Multilingual corpus data

In addition to parallel data, we can also create monolingual data. We will create a corpus that spans many fields.

Strengths of the bilingual platform Conyac

Creating high volume data is possible

Requests are handled manually by 142,000 bilingual speakers with a deep knowledge of various overseas cultures and practices, making it possible to create data that is both high in volume and useful.

Creating data in a short period of time is possible

Even with large requests, we can assign many bilingual speakers, ensuring a quick delivery time.

Low cost data creation is possible

Outsourcing work to the 142,000 bilingual speakers around the world (the crowd) has made low costs a reality at Conyac.
We will propose preparation and deployment strategies that suit your budget.

At Conyac, apart from creating data for AI we also offer the following services:


Data cleansing

Training a system on poor-quality big data degrades its performance. We assess and refine large volumes of data in short periods of time.


AI + crowdsourcing

The people in a crowdsourcing group can complement AI. You can request that work be outsourced to bilingual speakers around the world, or that output be revised by them.


Data expansion

We can expand existing data through crowdsourcing. Please contact us for details.

Examples of using Conyac to create data for AI


Speech data creation

Native English speakers are provided with 500 sentences and asked to read each sentence once, recording their voice using a designated smartphone app.

Volume: 500 sentences (English), 500 recordings of speakers
Turnaround: 4 weeks


Chatbot data creation

Creating 3 answer patterns in Japanese for cases in which a native Japanese speaker asks a situational question of a device capable of understanding speech.

Volume: 20,000 sentences
Turnaround: 3 weeks


Parallel corpus creation

Creating 5 similar sentences in English for every 1 sentence written in Japanese.

Volume: 500,000 sentences
Turnaround: 1.5 months



Leave the creation and sale of machine learning data to Conyac