From Audio/Video to Text with Escribano Transcriptions

Our service transcribes audio and video automatically and identifies participants with over 85%* accuracy.

Launch offers. Contact us and inquire about welcome bonuses and discounts for new clients.

Results 100x Faster

No matter how many video or audio files you send us, they are all processed in parallel, so delivery takes at most 24 hours and sometimes only minutes.

Time Saving

Reduce Operational Workload

Cut down on operational work and human error while reducing costs by 30% to 70% compared to traditional models.

Cost Reduction

Facilitate Value Addition

Focus on what really matters: adding value by making full use of your team's expertise.

Value Generation

Available Models



Escribano TEK is designed for transcribing audio with more technical language, better sound quality, and smaller groups.



Escribano FAST is 10x faster than TEK and is well suited to conventional language. It offers better diarization across group sizes, from a few participants up to traditional group sessions.

Features of Our Transcriptions

Technical model

TEK / Optimized

  • Analysis included, with profiles, findings, and recommendations
  • Speaker identification by number
  • Style and grammar correction
  • Diarization adjustment
  • Recommended for technical language
  • Built-in subtle autocorrection system
  • Can translate text
  • Ideal for interviews or small sessions (4 people)
  • Simultaneous transcription of up to 5 audio files

How It Works


Upload Your Files

Select and upload the audio or video files you need to transcribe. To avoid errors, the system does not allow two files with the same name in the same project. Make sure the audio is of good quality and avoid long silences at the beginning or end of each file.
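
Because the system rejects files that share a name within a project, it can help to check a batch for name clashes before uploading. The following is a minimal sketch that runs on your own machine; the function name and the sample paths are hypothetical, and it assumes that names are compared exactly (the page does not say whether the comparison ignores case or folder paths).

```python
from collections import Counter
from pathlib import Path


def find_duplicate_names(paths):
    """Return the file names that appear more than once in a batch, sorted."""
    # Only the final path component is compared, since files from
    # different folders end up in the same project.
    counts = Counter(Path(p).name for p in paths)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical batch: two recordings share the name "session.mp3".
batch = [
    "interviews/day1/session.mp3",
    "interviews/day2/session.mp3",
    "focus_group.mp4",
]
print(find_duplicate_names(batch))  # ['session.mp3']
```

Renaming the clashing files (for example, by adding the date) before uploading avoids the rejection.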


Select the model and type of transcription

You can choose between TEK and FAST for transcription, depending on your audio and the results you want to obtain. Also select whether you prefer Original or Optimized. Optimized transcription provides automated text curation, improving the readability of the text and how it is mapped to the participating speakers. You can then click the Process Files button.


Track Progress

The system shows the number of files and their stages:
  • Processing Stage
  • Transcription Stage
  • Optimizing Stage


Review and Download Results

You can see the details of each uploaded file, including its duration and the types of transcription generated, and you can download the results. From this section, you can also access Escribano Analysis to start generating customized analyses with your information (Escribano Analysis is still in beta).

Pricing for Our Transcription Models

Prices and included features for each model

Original Transcription

USD 0.11 per minute

Start today
  • Automatic literal transcription and diarization
  • 1 download file
  • +10 languages
  • Normal priority
  • Included technical support
  • Storage for 7 days
  • No curation or style correction
  • Access to new features

Custom Pro

USD 0.26 per minute

Start today
  • Human-curated and diarized transcription
  • 3 different downloads
  • Only English and Spanish
  • Delivery time varies depending on the project
  • Included technical support
  • Human curation of speaker assignment
  • Contact us before starting
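
As a rough guide, the per-minute prices above can be turned into a quick estimate. This sketch assumes that cost scales linearly with audio duration, with no minimum charge, rounding up to whole minutes, or discounts; the page does not state the billing rules, so treat the result as an approximation. The function and plan keys are hypothetical names chosen for illustration.

```python
from decimal import Decimal

# Per-minute rates in USD, taken from the price list above.
RATES_USD_PER_MINUTE = {
    "original": Decimal("0.11"),
    "custom_pro": Decimal("0.26"),
}


def estimate_cost(minutes, plan):
    """Estimate the cost in USD of transcribing `minutes` of audio."""
    # Decimal avoids binary floating-point rounding in currency amounts.
    total = Decimal(str(minutes)) * RATES_USD_PER_MINUTE[plan]
    return total.quantize(Decimal("0.01"))


print(estimate_cost(90, "original"))    # 9.90
print(estimate_cost(90, "custom_pro"))  # 23.40
```

For a 90-minute recording, the estimate comes to USD 9.90 for Original Transcription and USD 23.40 for Custom Pro.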

Frequently Asked Questions

No, our technological infrastructure doesn't use OpenAI models for transcriptions. Our models are hosted on servers separate from OpenAI's, ensuring data privacy.

Several factors influence automated transcription quality, including audio quality, participant clarity, terminology used, and more. It's a good practice to try various models to find the best balance between results, cost, and turnaround time. For more details, visit our Blog.

To start, no automated transcription is 100% accurate. Client needs vary: some require highly accurate transcriptions for third-party deliverables, while others use transcriptions as a reference. The most expensive model might not be the best fit. Our Original transcription often meets clients' needs. The Optimized model improves automated transcriptions with better speaker separation, grammar, and spelling, offering a professional 'curated' result. Human Curation, while more accurate, involves higher costs and is recommended only when necessary.

Escribano allows bulk file upload. Alternatively, we can work with a Custom Gateway to process files from your origin, or generate a 'Batch' option for offline uploading to our system.

Diarization (speaker separation) may struggle if participants speak infrequently or briefly, use one-word responses, or have similar voices. Background noise, simultaneous speaking, or echo can affect diarization as well.

For better speaker recognition, ensure each speaker talks for at least 30 seconds uninterrupted. Avoid very short responses like 'Yes', 'Right', or 'Sounds good'. Avoiding cross-talk also helps, and recording quality and devices play a vital role.
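
If you have rough timing notes for a recording, the 30-second guideline can be checked with a few lines of code. This is an illustrative sketch only: the `(speaker, start, end)` tuple layout and the function name are assumptions made for the example, not an Escribano export format.

```python
def speakers_below_threshold(segments, min_seconds=30.0):
    """Return speakers whose longest single turn is shorter than min_seconds.

    `segments` is a list of (speaker, start_seconds, end_seconds) tuples.
    """
    longest = {}
    for speaker, start, end in segments:
        # Track the longest uninterrupted turn seen for each speaker.
        longest[speaker] = max(longest.get(speaker, 0.0), end - start)
    return sorted(s for s, d in longest.items() if d < min_seconds)


# Hypothetical timing notes for a short two-person recording.
segments = [
    ("Speaker 1", 0.0, 42.5),
    ("Speaker 2", 42.5, 44.0),
    ("Speaker 1", 44.0, 60.0),
    ("Speaker 2", 60.0, 75.0),
]
print(speakers_below_threshold(segments))  # ['Speaker 2']
```

Here Speaker 2 never talks for more than 15 seconds at a stretch, so that voice is the one most likely to be misattributed.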

Accuracy depends on factors like audio quality, speaker count, and audio length. Ensuring each speaker talks for at least 30 seconds and avoiding brief responses can enhance accuracy. However, no model is perfect, especially in challenging scenarios.

Want to see more?

Watch our online video of Escribano Transcripciones in action or visit the contact page to schedule a personalized demonstration.