Frequently Asked Questions

Common questions about IdentityCall’s API, features, and implementation.

General Questions

What is IdentityCall?

IdentityCall is a call recording analysis platform that provides:

Transcription: Convert audio to text with speaker diarization
Emotion Detection: Analyze emotional tone per speaker
Goal Evaluation: AI-powered assessment of call objectives
Pause Analysis: Detect significant pauses for compliance
Voice Biometrics: Speaker identification via voice profiles

What audio formats are supported?

Format	Extension	Max Size
MP3	.mp3	100 MB
WAV	.wav	500 MB
M4A	.m4a	100 MB
OGG	.ogg	100 MB
FLAC	.flac	500 MB
WebM	.webm	100 MB
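
The limits in the table can be checked on the client before uploading. The following is a minimal sketch, not part of any IdentityCall SDK: check_upload is a hypothetical helper, and it assumes 1 MB means 1,000,000 bytes, which this FAQ does not specify.

```python
from pathlib import Path

# Size limits from the table above, in megabytes.
MAX_SIZE_MB = {
    ".mp3": 100, ".wav": 500, ".m4a": 100,
    ".ogg": 100, ".flac": 500, ".webm": 100,
}
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the extension or size falls outside the limits."""
    ext = Path(filename).suffix.lower()
    if ext not in MAX_SIZE_MB:
        raise ValueError(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_SIZE_MB[ext] * BYTES_PER_MB:
        raise ValueError(f"{ext} files are limited to {MAX_SIZE_MB[ext]} MB")
```

The server remains the authority on what it accepts; a check like this only avoids uploads that would clearly be rejected.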

What languages are supported?

IdentityCall uses Google Gemini for transcription, supporting 100+ languages. Common languages include English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, and Russian.

See the complete language list.

How long does transcription take?

Processing time depends on audio length:

Under 5 minutes: 30-60 seconds
5-15 minutes: 1-3 minutes
15-60 minutes: 3-10 minutes
Over 60 minutes: 10-20 minutes

Transcription runs asynchronously. Upload your file, receive an ID, and poll for completion or use webhooks.
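
If you poll rather than use webhooks, the loop might look like the sketch below. It assumes you supply fetch_status, a hypothetical callable that requests the recording and returns its status string. The terminal values "completed" and "failed" are the ones listed for webhook payloads; any other value is treated as still in progress.

```python
import time
from typing import Callable


def wait_for_recording(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 1800.0,
) -> str:
    """Poll until the status is 'completed' or 'failed', or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("recording did not finish before the timeout")
        time.sleep(interval)
```

The default timeout of 30 minutes is an arbitrary choice that leaves headroom over the 10-20 minutes quoted for recordings longer than an hour.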

Authentication & Access

How do I get an API key?

API keys are generated from your IdentityCall dashboard:

Log in at app.identitycall.ai
Navigate to Settings > API Keys
Click “Create API Key”
Set permissions (read, write, delete)
Copy and securely store your key

Keys are prefixed with idc_ and should be passed as Bearer tokens.

What permissions do API keys have?

API keys support three permission levels:

read: List and view recordings, transcriptions, results
write: Upload new recordings, update metadata
delete: Remove recordings permanently

For security, grant each key only the permissions it needs.

How do I authenticate requests?

Include your API key in the Authorization header:


Authorization: Bearer idc_your_api_key
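
As an illustration, the header can be attached with Python's standard library. This is a sketch only: build_request is a hypothetical helper, and the base URL is passed in by the caller because this FAQ does not state it.

```python
import urllib.request


def build_request(
    base_url: str, path: str, api_key: str, method: str = "GET"
) -> urllib.request.Request:
    """Build a request that carries the API key as a Bearer token."""
    if not api_key.startswith("idc_"):
        raise ValueError("IdentityCall API keys start with 'idc_'")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method=method,
    )
```

Pass the result to urllib.request.urlopen to send it, and load the key from an environment variable or secret store rather than hard-coding it.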

What happens if my API key is compromised?

Immediately revoke the key in your dashboard
Create a new key with appropriate permissions
Update your applications with the new key
Review API logs for unauthorized access

Transcription & Analysis

How accurate is the transcription?

Transcription accuracy depends on audio quality:

Clear audio: 95%+ accuracy
Background noise: 85-95% accuracy
Multiple speakers talking at once: 80-90% accuracy

For best results, use recordings with clear audio and minimal background noise.

How does speaker diarization work?

Speaker diarization identifies and labels different speakers in a recording. IdentityCall:

Detects voice changes throughout the audio
Clusters similar voice segments
Assigns consistent labels (Speaker 1, Speaker 2, etc.)
Optionally matches against enrolled voice profiles

What emotions are detected?

Each dialogue segment includes probability scores for:

Happy
Neutral
Calm
Sad
Angry
Fearful
Surprised
Disgust

Scores range from 0.0 to 1.0 and sum to approximately 1.0.
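
Because the scores behave like a probability distribution, a common way to use them is to take the highest-scoring label for each segment. A minimal sketch, assuming the scores arrive as a mapping from emotion name to score (the exact response shape and key names are not shown in this FAQ):

```python
from typing import Optional


def dominant_emotion(
    scores: dict[str, float], min_score: float = 0.0
) -> Optional[tuple[str, float]]:
    """Return (label, score) for the top emotion, or None if it is below min_score."""
    if not scores:
        return None
    label = max(scores, key=scores.get)
    if scores[label] < min_score:
        return None
    return label, scores[label]
```

Setting min_score above zero lets you ignore segments where no single emotion clearly dominates.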

How do goals work?

Goals are configured in your project settings with:

Name: Descriptive title
Description: What constitutes achievement
Criteria: AI evaluation rules

After transcription, the AI evaluates each goal and returns:

Whether the goal was met (boolean)
Confidence score (0-100)
Explanation of the evaluation
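
One way to act on these results is to surface goals that were missed or evaluated with low confidence. The sketch below assumes each result is a dictionary with the hypothetical keys name, met, and confidence; check the API Reference for the actual field names.

```python
def goals_needing_review(results: list[dict], min_confidence: int = 70) -> list[str]:
    """Return names of goals that were not met or scored below min_confidence."""
    return [
        r["name"]
        for r in results
        if not r["met"] or r["confidence"] < min_confidence
    ]
```

The cutoff of 70 is an arbitrary starting point on the 0-100 scale, not a recommended value.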

What makes a pause “non-compliant”?

Pause compliance is determined by your project’s configured thresholds:

Minimum duration: Pauses shorter than this are ignored
Maximum duration: Pauses longer than this are flagged
Context rules: Specific requirements for regulated scenarios

Non-compliant pauses may indicate missed disclosures or inappropriate wait times.
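
The two duration thresholds amount to a simple three-way classification. The sketch below illustrates that logic only; the labels are hypothetical, durations are assumed to be in seconds, and context rules are not modeled.

```python
def classify_pause(duration_s: float, min_s: float, max_s: float) -> str:
    """Classify a pause against the minimum and maximum duration thresholds."""
    if min_s > max_s:
        raise ValueError("min_s must not exceed max_s")
    if duration_s < min_s:
        return "ignored"   # too short to count as a pause
    if duration_s > max_s:
        return "flagged"   # longer than the configured maximum
    return "compliant"
```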

Voice Biometrics

How do voice profiles work?

Voice profiles capture unique voice characteristics:

Enrollment: Upload audio of a known speaker
Embedding: Extract voice signature
Matching: Compare against future recordings

When processing new recordings, IdentityCall matches speaker segments against enrolled profiles.

How many samples are needed for enrollment?

Minimum requirements:

Duration: At least 10 seconds of clear speech
Quality: Clear audio without background noise
Variety: Natural speaking (not reading scripted text)

More samples improve matching accuracy.

How accurate is voice matching?

Voice biometric accuracy varies:

Same conditions: 95%+ accuracy
Different phones/microphones: 85-95% accuracy
Background noise: 75-90% accuracy

Matching returns confidence scores to help set thresholds.
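
Applying a threshold could look like the following sketch. It assumes matches are available as (profile ID, confidence) pairs with confidence on a 0.0-1.0 scale; both the shape and the scale are assumptions, so adjust them to the actual response.

```python
from typing import Optional


def best_match(
    candidates: list[tuple[str, float]], threshold: float = 0.85
) -> Optional[str]:
    """Return the highest-confidence profile ID, or None if it misses the threshold."""
    if not candidates:
        return None
    profile_id, confidence = max(candidates, key=lambda c: c[1])
    return profile_id if confidence >= threshold else None
```

A higher threshold reduces false matches at the cost of leaving more speakers unidentified, which matters most for recordings made on different devices or with background noise.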

API Usage

What are the rate limits?

Rate limits depend on your subscription:

Starter: 60 requests/minute
Professional: 300 requests/minute
Enterprise: Custom limits

Rate limit headers are included in all responses:


X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1642345678
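
These headers let a client slow down before it is rejected. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but this FAQ does not state, and that headers are available as a plain dictionary with the exact names shown above.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before sending again; 0.0 if requests remain."""
    if int(headers.get("X-RateLimit-Remaining", "1")) > 0:
        return 0.0
    reset_at = float(headers["X-RateLimit-Reset"])  # assumed Unix timestamp
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```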

How do I handle rate limiting?

When rate limited (HTTP 429), the response includes:

retry_after: Seconds to wait before retrying

Implement exponential backoff:


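import time


class RateLimitError(Exception):
    """Placeholder for whatever your HTTP client raises on HTTP 429."""

    def __init__(self, retry_after=0):
        super().__init__("rate limited")
        self.retry_after = retry_after


def make_request_with_retry(func, max_retries=3, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # out of retries
            # Double the delay on each attempt, but never wait less than retry_after.
            time.sleep(max(e.retry_after, base_delay * 2 ** attempt))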

Is there a webhook for completion notifications?

Yes, webhooks can notify your application when processing completes. Configure webhook URLs in your project settings. Webhook payloads include:

Recording ID
Status (completed, failed)
Timestamp
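
A receiving endpoint might validate the payload before acting on it. The sketch below assumes a JSON body with the hypothetical field names recording_id, status, and timestamp; this FAQ lists what the payload contains but not how the fields are named.

```python
import json


def parse_webhook(body: bytes) -> dict:
    """Parse a webhook body and check that it reports a known final status."""
    payload = json.loads(body)
    status = payload["status"]
    if status not in ("completed", "failed"):
        raise ValueError(f"unexpected status: {status!r}")
    return {
        "recording_id": payload["recording_id"],
        "status": status,
        "timestamp": payload["timestamp"],
    }
```

On "completed" you would typically fetch the recording's results; on "failed", its error details.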

How do I handle pagination?

List endpoints support pagination:


GET /recordings?page=1&per_page=50

Responses include metadata:


{
  "meta": {
    "current_page": 1,
    "total_pages": 10,
    "total_count": 487,
    "per_page": 50
  }
}

Iterate through pages until current_page >= total_pages.
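
That loop can be wrapped in a generator, as in the sketch below. It assumes you supply fetch_page, a hypothetical callable that requests one page and returns the decoded JSON body, and that the items sit under a data key, which is not shown in the response excerpt above.

```python
from typing import Callable, Iterator


def iter_recordings(
    fetch_page: Callable[[int, int], dict], per_page: int = 50
) -> Iterator[dict]:
    """Yield every item from a paginated list endpoint, one page at a time."""
    page = 1
    while True:
        body = fetch_page(page, per_page)
        yield from body.get("data", [])  # "data" is an assumed key name
        meta = body["meta"]
        if meta["current_page"] >= meta["total_pages"]:
            break
        page += 1
```

Because it is a generator, pages are fetched only as the caller consumes items, which keeps request volume within the rate limits when you stop early.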

Can I delete recordings after processing?

Yes, use the DELETE endpoint:


DELETE /recordings/{id}

This permanently removes:

Audio file
Transcription
Analysis results
Associated metadata

This action cannot be undone.

Security & Privacy

How is my data secured?

IdentityCall implements multiple security measures:

Encryption in transit: TLS 1.3 for all API connections
Encryption at rest: AES-256 for stored data
Access controls: Role-based permissions
Audit logging: All API access is logged

Where is data stored?

Data is stored in secure cloud infrastructure:

Primary: EU (Germany)
Backups: Geographically distributed

Enterprise plans support data residency requirements.

How long is data retained?

Default retention periods:

Audio files: 90 days
Transcriptions: Until deleted
Analysis results: Until recording deleted

Enterprise plans support custom retention policies.

Is IdentityCall GDPR compliant?

Yes, IdentityCall is designed for GDPR compliance:

Data processing agreements available
Right to deletion supported
Data export functionality
Audit logs maintained

Contact support for compliance documentation.

Troubleshooting

Why did my transcription fail?

Common failure reasons:

Invalid audio format: Use supported formats
Corrupted file: Re-encode the audio
Too short: Minimum 1 second of speech
No speech detected: Check audio contains speech

Check the recording’s error details in the response.

Why are speakers not identified?

Speaker identification requires:

Enrolled profiles: Create voice profiles first
Sufficient audio: At least 5 seconds per speaker
Clear audio: Background noise affects matching

Why are emotions showing unexpected values?

Emotion detection may be affected by:

Audio quality: Background noise skews results
Speaker volume: Very quiet speech is harder to analyze
Non-speech sounds: Laughter, coughing affect scores

How do I debug API requests?

Check response status codes and error messages
Verify authentication header format
Confirm request body/parameters are valid
Review rate limit headers
Contact support with request IDs for assistance

Integration

Do you have SDKs?

Yes, SDKs are available. They handle authentication, retries, and error handling.

Can I use the API with my existing CRM?

Yes, the REST API integrates with any system that can make HTTP requests. Common integrations:

Salesforce: Via Apex HTTP callouts
HubSpot: Via workflow webhooks
Zendesk: Via custom apps

Is there a Zapier integration?

Not currently. Use the REST API directly or webhooks for automation.

Can I process recordings in real-time?

The current API processes completed recordings. For live call analysis, contact sales about our real-time streaming solution.

Still Have Questions?

Contact Support: Email our support team
API Reference: Complete endpoint documentation