Asynchronous recognition
Automation of speech recognition using advanced technologies.
Asynchronous recognition is suited to transcribing audio and video recordings.
Main steps:
1. Upload the file to be recognized to our storage over HTTP.
2. Run the recognition task: in the request, pass the name of the uploaded file and the recognition parameters. The response will contain the recognized text.
File upload code:
curl https://yazapishu.ru/api/upload.php \
--header "Authorization: <api_token>" \
-F "upload=@<file_name>"
where:
api_token - your unique user identifier, issued after registration and used for authorization.
file_name - the name of the file to recognize (may include the path to the file).
Recognition launch code:
curl https://yazapishu.ru/api.php \
--header "Authorization: <api_token>" \
--header "Content-Type: application/json" \
--data '{
"audio_name": "<file_name>",
"language": "<language_cod>"
}'
List of additional parameters:
file_name - the name of the uploaded file to recognize (only the file name with its extension, without the path. Example: audio1.mp3).
"language": "<language_cod>" - recognition language
"speaker": "-l" - split the text by speaker
"timecod": "yes" - include timecodes in the text
"json": "yes" - get data in JSON format
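The parameters above can be combined in a single request body. The following is a minimal sketch, not part of the service itself: `build_request_body` is a hypothetical helper name, and it assumes the file name and language code contain no characters that need JSON escaping.

```shell
# Hypothetical helper: assemble the JSON request body for api.php.
# Usage: build_request_body <file_name> <language_code> [speaker] [timecod] [json]
# Assumption: the arguments contain no quotes or backslashes (no JSON escaping is done).
build_request_body() {
    file_name=$1
    language=$2
    shift 2
    body=$(printf '{"audio_name": "%s", "language": "%s"' "$file_name" "$language")
    # Each remaining argument switches on one of the documented optional parameters.
    for option in "$@"; do
        case $option in
            speaker) body="$body, \"speaker\": \"-l\"" ;;
            timecod) body="$body, \"timecod\": \"yes\"" ;;
            json)    body="$body, \"json\": \"yes\"" ;;
            *) printf 'unknown option: %s\n' "$option" >&2; return 1 ;;
        esac
    done
    printf '%s}\n' "$body"
}
```

The result can be passed to curl with `--data "$(build_request_body audio1.mp3 ru speaker json)"`.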
Available recognition languages:
Russian - ru, English - en, English (British) - en_uk, English (American) - en_us, English (Australian) - en_au, Spanish - es,
Italian - it, Chinese - zh, Korean - ko, German - de, Dutch - nl, Polish - pl, Portuguese - pt, Turkish - tr,
French - fr, Finnish - fi, Japanese - ja
Example parameter: "language": "ru"
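To catch a mistyped language code before sending a request, the list above can be checked locally. This is only a sketch: `is_supported_language` is a hypothetical helper name, and the codes are copied from the list in this document, so it has to be updated by hand if the service adds languages.

```shell
# Hypothetical helper: succeed only for language codes listed in this document.
is_supported_language() {
    case $1 in
        ru|en|en_uk|en_us|en_au|es|it|zh|ko|de|nl|pl|pt|tr|fr|fi|ja) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_supported_language en_uk && echo ok` prints `ok`, while an unlisted code makes the function return a non-zero status.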
Getting data in JSON format
Recognition launch code:
curl https://yazapishu.ru/api.php \
--header "Authorization: <api_token>" \
--header "Content-Type: application/json" \
--data '{
"audio_name": "<file_name>",
"language": "<language_cod>",
"json": "yes"
}'
The result contains the full recognized text, the text split into speaker utterances, and a list of recognized words.
The response will be data in JSON format, where:
Key | Type | Description |
---|---|---|
text | string | Transcript of the audio file |
words | array | Array containing information about each word |
utterances | array | Array containing the speakers' statements |
utterances[i].speaker | string | Statement of a specific speaker |
words[i].text | string | Text of the i-th word in the transcript |
words[i].start | number | Start of the i-th word in the audio file, in milliseconds |
words[i].end | number | End of the i-th word in the audio file, in milliseconds |
words[i].confidence | number | Confidence score for the recognition of the i-th word |
words[i].speaker | string | The speaker who uttered the i-th word, if speaker separation is enabled |
status | string | Recognition status: completed or error |
audio_duration | number | Audio duration |
id | number | Identifier of the current recognition |
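Since status is either completed or error, a script will usually want to check it before using the text. The sketch below pulls the status out with sed; `recognition_status` is a hypothetical helper name and the sample response is made up from the keys in the table, not captured from the service. A real JSON parser such as jq would be more robust than this pattern match.

```shell
# Hypothetical helper: print the value of the "status" key from a JSON response.
# Assumption: the value is a lowercase word, as in "completed" or "error".
recognition_status() {
    printf '%s' "$1" | tr -d '\n' |
        sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Illustrative response only; real responses contain more keys.
sample='{"id": 1, "status": "completed", "text": "hello world"}'

if [ "$(recognition_status "$sample")" = "completed" ]; then
    echo "recognition finished"
else
    echo "recognition failed" >&2
fi
```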
Requesting a completed recognition by ID:
curl https://yazapishu.ru/api/api_request.php \
--header "Authorization: <api_token>" \
--header "Content-Type: application/json" \
--data '{
"id": "<id>"
}'
Example of running recognition from a shell:
audio='myspeech.mp3'
token='773fec3ac285bc7c9e9951ef7f7ddad8'
# Step 1: upload the file
curl https://yazapishu.ru/api/upload.php \
--header "Authorization: $token" \
-F "upload=@$audio"
# Step 2: start recognition and capture the JSON response
text=$(curl https://yazapishu.ru/api.php \
--header "Authorization: $token" \
--header "Content-Type: application/json" \
--data '{
"audio_name": "'"$audio"'",
"language": "en",
"speaker": "-l",
"json": "yes"
}')
printf '%s\n' "$text"
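The upload step accepts a path to the file, while the recognition request expects only the file name with its extension. A wrapper can handle that difference with `basename`. This is a sketch under assumptions: `transcribe` and the `API_TOKEN` variable are hypothetical names, the output of the upload call is simply discarded, and the arguments are assumed to need no JSON escaping.

```shell
# Hypothetical wrapper around the two documented calls.
# Assumption: API_TOKEN holds the token issued after registration.
# Usage: transcribe <path_to_file> <language_code>
transcribe() {
    path=$1
    language=$2
    # The recognition request takes the bare file name, without directories.
    name=$(basename "$path")

    # Step 1: upload the file; stop if curl reports a failure.
    curl --silent --show-error --fail https://yazapishu.ru/api/upload.php \
        --header "Authorization: $API_TOKEN" \
        -F "upload=@$path" >/dev/null || return 1

    # Step 2: start recognition; the response is written to standard output.
    curl --silent --show-error --fail https://yazapishu.ru/api.php \
        --header "Authorization: $API_TOKEN" \
        --header "Content-Type: application/json" \
        --data "{\"audio_name\": \"$name\", \"language\": \"$language\"}"
}
```

With the token exported, a call such as `transcribe recordings/myspeech.mp3 en` uploads the file from its path and requests recognition for `myspeech.mp3`.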
You can upload audio and video files in the following formats: mp3, wav, mp4, avi, aac, m4a, ac3, flac, ogg, wma, mov, flv, 3gp, asf, wmv, mkv, webm.
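A quick local check of the file extension can avoid uploading a file the service does not accept. The sketch below uses a hypothetical helper name, `is_supported_format`, with the extensions copied from the list above. Treating extensions case-insensitively is an assumption, since the source does not say how the service handles uppercase extensions.

```shell
# Hypothetical helper: succeed if the file extension is in the documented list.
is_supported_format() {
    # A name without a dot has no extension to check.
    case $1 in
        *.*) ;;
        *) return 1 ;;
    esac
    # Assumption: "Talk.MP3" should be treated like "talk.mp3".
    extension=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case $extension in
        mp3|wav|mp4|avi|aac|m4a|ac3|flac|ogg|wma|mov|flv|3gp|asf|wmv|mkv|webm) return 0 ;;
        *) return 1 ;;
    esac
}
```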
All recognition results will also be available in your personal account.
You can run up to 100 parallel recognitions.
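When processing many files, the number of concurrent requests can be capped on the client side. One way to do this, shown here only as a sketch, is the `-P` option of xargs (available in GNU and BSD xargs, though not required by POSIX). `run_parallel` and `MAX_PARALLEL` are hypothetical names.

```shell
# Client-side cap matching the documented limit of 100 parallel recognitions.
MAX_PARALLEL=100

# Hypothetical helper: read items from standard input and run the given
# command once per item, with at most MAX_PARALLEL commands at a time.
# Note: xargs splits input on whitespace, so items must not contain spaces,
# and the command must be an executable, not a shell function.
run_parallel() {
    xargs -P "$MAX_PARALLEL" -n 1 "$@"
}
```

For example, `ls *.mp3 | run_parallel ./transcribe.sh` would run a (hypothetical) `transcribe.sh` script for each file, never exceeding the limit.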