Profanity Hashing

Profanity hashing automatically detects swear words in transcribed text and replaces each one with hash symbols (####, as in the example outputs on this page). This feature is useful for content moderation and compliance.

You can apply profanity hashing during transcription or to existing text:

Option 1: Profanity Hashing During Transcription

Request:

Don’t forget to replace the your-api-key placeholder in the X-API-Key header with your own secret key.

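import requests

# Transcription endpoint and API key header
url = "https://tb2.shunyalabs.ai/v1/transcriptions"
headers = {"X-API-Key": "your-api-key"}

# Upload the audio file and enable profanity hashing for the transcript
with open("customer_call.wav", "rb") as f:
    files = {"file": f}
    data = {
        "enable_profanity_hashing": "true"
    }

    response = requests.post(
        url,
        headers=headers,
        files=files,
        data=data
    )

# Raise an exception on HTTP errors before reading the JSON body
response.raise_for_status()
print(response.json())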

Example Output:

{
  "success": true,
  "text": "I can't believe this #### software keeps crashing. This is #### ridiculous!",
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "I can't believe this #### software keeps crashing. This is #### ridiculous!",
      "speaker": "SPEAKER_00"
    }
  ]
}
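Once a transcription response like the one above is available, it can be useful to know how many words were hashed in each segment, for example to flag calls for review. The following is a minimal sketch, not part of the API: the function name masked_counts and the rule "a hashed word is a run of two or more # characters" are assumptions based only on the example output shown here.

```python
import re

# Assumption: a hashed word appears as a run of two or more '#' characters,
# as in the example output. The service may use a different convention.
MASK_PATTERN = re.compile(r"#{2,}")


def masked_counts(result):
    """Return (start, end, hashed_word_count) for each segment in a response dict."""
    counts = []
    for segment in result.get("segments", []):
        hits = MASK_PATTERN.findall(segment.get("text", ""))
        counts.append((segment["start"], segment["end"], len(hits)))
    return counts


# The example response from this section, reduced to the fields used here.
example = {
    "success": True,
    "segments": [
        {
            "start": 0.0,
            "end": 5.2,
            "text": "I can't believe this #### software keeps crashing. "
                    "This is #### ridiculous!",
            "speaker": "SPEAKER_00",
        }
    ],
}

print(masked_counts(example))  # [(0.0, 5.2, 2)]
```

Counting from the segments rather than the top-level text keeps the timestamps attached, so a reviewer can jump straight to the relevant part of the audio.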

Option 2: Standalone Profanity Hashing

Parameters:

text (String, required): Input text to process
enable_profanity_hashing (String, required): Set to "true" for automatic profanity detection
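Because enable_profanity_hashing is documented as a String rather than a boolean, a small helper can keep callers from sending a Python True by mistake. This is only a sketch: build_payload is a hypothetical name, and the "false" value is an assumption, since only "true" is documented above.

```python
def build_payload(text, enable_hashing=True):
    """Build the form fields for a standalone profanity hashing request."""
    # 'text' is required, so reject empty or non-string input early.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    return {
        "text": text,
        # The parameter is a string; "false" is an assumed value, not documented.
        "enable_profanity_hashing": "true" if enable_hashing else "false",
    }


payload = build_payload("This transcript contains some text")
print(payload)
```

The resulting dictionary can be passed as the data argument of the request shown in this section.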

Automatic Profanity Detection:

Don’t forget to replace the your-api-key placeholder in the X-API-Key header with your own secret key.

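import requests

# Text intelligence endpoint and API key header
url = "https://tb.shunyalabs.ai/v1/text_intelligence"
headers = {"X-API-Key": "your-api-key"}

# Form fields: the text to clean and the flag that enables profanity hashing
data = {
    "text": "This transcript contains some shitty language",
    "enable_profanity_hashing": "true"
}

response = requests.post(url, headers=headers, data=data)

# Raise an exception on HTTP errors before reading the JSON body
response.raise_for_status()
print(response.json())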

Example Output:

{
  "clean_text": "This transcript contains some #### language"
}
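For auditing, you may want to know which words were replaced. The response above returns only clean_text, so one option is to compare it with the text you sent. The sketch below assumes the service replaces each profane word with a single run of # characters and leaves all other tokens in place, so the two strings align word for word; masked_words is a hypothetical helper, not an API function.

```python
from typing import List

# Punctuation that may be attached to a word; '#' is deliberately excluded.
PUNCT = ".,!?;:\"'()"


def masked_words(original: str, clean: str) -> List[str]:
    """Return the words in `original` that appear hashed in `clean`."""
    orig_tokens = original.split()
    clean_tokens = clean.split()
    # Assumption: hashing never adds or removes whitespace-separated tokens.
    if len(orig_tokens) != len(clean_tokens):
        raise ValueError("token counts differ; cannot align the two texts")
    masked = []
    for before, after in zip(orig_tokens, clean_tokens):
        core = after.strip(PUNCT)
        # A token counts as hashed if it consists only of '#' characters.
        if core and set(core) == {"#"} and before != after:
            masked.append(before.strip(PUNCT))
    return masked


sent = "This transcript contains some shitty language"
received = {"clean_text": "This transcript contains some #### language"}
print(masked_words(sent, received["clean_text"]))  # ['shitty']
```

If the alignment assumption does not hold for your data, the function raises ValueError rather than guessing.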

Use Cases for Profanity Hashing

Content Moderation: Automatically detect and mask profanity in user-generated content so that it meets the standards of public-facing platforms or family-friendly environments.
Compliance: Help transcribed content meet organizational policies or industry-specific standards that restrict offensive language in stored or published records.
Broadcast Media: Prepare transcripts of audio and video content for distribution on regulated television or radio, where profanity must be censored to comply with broadcasting standards.