datasources.audio_to_text.audio_to_text
Audio Upload to Text
This data source acts similarly to a Preset, but because it needs SearchMedia's validate_query and after_create methods to run, chaining that processor does not work (Presets essentially only run the process and after_process methods of their processors and skip those two datasource-only methods).
```python
"""
Audio Upload to Text

This data source acts similar to a Preset, but because it needs SearchMedia's validate_query and after_create methods
to run, chaining that processor does not work (Presets essentially only run the process and after_process methods
of their processors and skip those two datasource only methods).
"""

from datasources.media_import.import_media import SearchMedia
from processors.machine_learning.whisper_speech_to_text import AudioToText


class AudioUploadToText(SearchMedia):
    type = "upload-audio-to-text-search"  # job ID
    category = "Search"  # category
    title = "Convert speech to text"  # title displayed in UI
    description = "Upload your own audio and use OpenAI's Whisper model to create transcripts"  # description displayed in UI

    @classmethod
    def is_compatible_with(cls, module=None, user=None):
        #TODO: False here does not appear to actually remove the datasource from the "Create dataset" page so technically
        # this method is not necessary; if we can adjust that behavior, it ought to function as intended

        # Ensure the Whisper model is available
        return AudioToText.is_compatible_with(module=module, user=user)

    @classmethod
    def get_options(cls, parent_dataset=None, user=None):
        # We need both sets of options for this datasource
        media_options = SearchMedia.get_options(parent_dataset=parent_dataset, user=user)
        whisper_options = AudioToText.get_options(parent_dataset=parent_dataset, user=user)
        media_options.update(whisper_options)

        #TODO: there are some odd formatting issues if we use those derived options
        # The intro help text is not displayed correct (does not wrap)
        # Advanced Settings uses []() links which do not work on the "Create dataset" page, so we adjust

        media_options["intro"]["help"] = ("Upload audio files here to convert speech to text. "
                                          "4CAT will use OpenAI's Whisper model to create transcripts."
                                          "\n\nFor information on using advanced settings: [Command Line Arguments (CLI)](https://github.com/openai/whisper/blob/248b6cb124225dd263bb9bd32d060b6517e067f8/whisper/transcribe.py#LL374C3-L374C3)")
        media_options["advanced"]["help"] = "Advanced Settings"

        return media_options

    @staticmethod
    def validate_query(query, request, user):
        # We need SearchMedia's validate_query to upload the media
        media_query = SearchMedia.validate_query(query, request, user)

        # Here's the real trick: act like a preset and add another processor to the pipeline
        media_query["next"] = [{"type": "audio-to-text",
                                "parameters": query.copy()}]
        return media_query
```
class AudioUploadToText(SearchMedia):
Abstract processor class
A processor takes a finished dataset as input and processes its result in some way, producing another dataset as output. The input is thus a file, and the output (usually) is as well. In other words, the result of a processor can be used as input for another processor (though whether and when this is useful is another question).
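The file-in, file-out idea described above can be illustrated with a minimal sketch. It does not use 4CAT's own classes: `lowercase_processor`, `count_processor` and `run_chain` are hypothetical names, and plain CSV and text files stand in for datasets.

```python
import csv
import tempfile
from pathlib import Path


def lowercase_processor(source: Path, target: Path) -> None:
    """Hypothetical processor: copy the 'body' column, lowercased."""
    with source.open(newline="") as infile, target.open("w", newline="") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=["body"])
        writer.writeheader()
        for row in csv.DictReader(infile):
            writer.writerow({"body": row["body"].lower()})


def count_processor(source: Path, target: Path) -> None:
    """Hypothetical processor: write the number of data rows in the input."""
    with source.open(newline="") as infile:
        total = sum(1 for _ in csv.DictReader(infile))
    target.write_text(str(total))


def run_chain(rows):
    """Write `rows` to a file, then feed each processor's output to the next."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        first, second, third = folder / "a.csv", folder / "b.csv", folder / "c.txt"
        with first.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=["body"])
            writer.writeheader()
            writer.writerows({"body": row} for row in rows)

        lowercase_processor(first, second)  # output of step 1 ...
        count_processor(second, third)      # ... is the input of step 2
        return second.read_text().splitlines(), third.read_text()
```

Each step only needs to know the location of its input file, which is what makes the result of one processor usable by another.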
To determine whether a processor can process a given dataset, you can define an is_compatible_with(FourcatModule module=None, str user=None) -> bool class method, which takes a dataset as argument and returns a bool that determines whether this processor is considered compatible with that dataset. For example:

```python
@classmethod
def is_compatible_with(cls, module=None, user=None):
    return module.type == "linguistic-features"
```
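As an illustration of how such a check might be consumed, the sketch below filters a list of processors by compatibility with a dataset. `FakeDataset`, the two processor classes and `compatible_processors` are hypothetical stand-ins, not part of 4CAT. The guard against `module=None` is an addition that seems prudent, given that the signature allows that default.

```python
from dataclasses import dataclass


@dataclass
class FakeDataset:
    """Hypothetical stand-in for a 4CAT dataset; only `type` is modelled."""
    type: str


class LinguisticProcessor:
    @classmethod
    def is_compatible_with(cls, module=None, user=None):
        # Guard against module=None, which the default argument allows
        return module is not None and module.type == "linguistic-features"


class AnyDatasetProcessor:
    @classmethod
    def is_compatible_with(cls, module=None, user=None):
        # Compatible with every dataset
        return True


def compatible_processors(processors, dataset, user=None):
    """Return the processors that report themselves compatible with `dataset`."""
    return [p for p in processors if p.is_compatible_with(module=dataset, user=user)]


available = [LinguisticProcessor, AnyDatasetProcessor]
for_features = compatible_processors(available, FakeDataset("linguistic-features"))
for_other = compatible_processors(available, FakeDataset("image-downloader"))
```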
@classmethod
def is_compatible_with(cls, module=None, user=None):
@classmethod
def get_options(cls, parent_dataset=None, user=None):
Get processor options
This method by default returns the class's "options" attribute, or an empty dictionary. It can be redefined by processors that need more fine-grained options, e.g. in cases where the availability of options is partially determined by the parent dataset's parameters.
Parameters
- DataSet parent_dataset: An object representing the dataset that the processor would be run on
- User user: Flask user the options will be displayed for, in case they are requested for display in the 4CAT web interface. This can be used to show some options only to privileged users.
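A redefinition along the lines described above might look like the following sketch. It is not 4CAT code: the option dictionaries use a placeholder schema, and the `parameters` attribute, the `media_type` key and the `is_admin` flag are assumptions made for illustration.

```python
import copy
from types import SimpleNamespace


class BaseProcessor:
    # Placeholder option schema; real 4CAT option definitions may look different
    options = {"language": {"type": "text", "default": "en", "help": "Language"}}

    @classmethod
    def get_options(cls, parent_dataset=None, user=None):
        # Default behaviour described above: the class attribute, or an empty dict
        return getattr(cls, "options", {})


class FineGrainedProcessor(BaseProcessor):
    @classmethod
    def get_options(cls, parent_dataset=None, user=None):
        # Deep-copy so per-request changes never leak into the class attribute
        options = copy.deepcopy(super().get_options(parent_dataset=parent_dataset, user=user))

        # Option that depends on the parent dataset's parameters (hypothetical key)
        parameters = getattr(parent_dataset, "parameters", None) or {}
        if parameters.get("media_type") == "audio":
            options["model"] = {"type": "choice", "default": "small", "help": "Model size"}

        # Option shown to privileged users only (hypothetical `is_admin` flag)
        if getattr(user, "is_admin", False):
            options["advanced"] = {"type": "text", "default": "", "help": "Advanced settings"}

        return options


audio_parent = SimpleNamespace(parameters={"media_type": "audio"})
admin = SimpleNamespace(is_admin=True)
```

Copying before modifying matters here because the default implementation hands back the shared class attribute; the get_options method of this data source similarly changes the merged dictionary in place after calling update.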
@staticmethod
def validate_query(query, request, user):
Step 1: Validate query and files
Confirms that the uploaded files exist and that the media type is valid.
Parameters
- dict query: Query parameters, from client-side.
- request: Flask request
- User user: User object of user who has submitted the query
Returns
Safe query parameters
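The pattern this data source uses, validating through the parent class and then queueing a follow-up processor under a "next" key, can be sketched in isolation. `ParentSource` and `ChainedSource` are hypothetical stand-ins for SearchMedia and AudioUploadToText, and the validation shown is a placeholder rather than what SearchMedia actually checks.

```python
class ParentSource:
    """Hypothetical stand-in for a datasource such as SearchMedia."""

    @staticmethod
    def validate_query(query, request, user):
        # Placeholder validation: require a media type and keep only that key
        if not query.get("media_type"):
            raise ValueError("No media type given")
        return {"media_type": query["media_type"]}


class ChainedSource(ParentSource):
    @staticmethod
    def validate_query(query, request, user):
        # Let the parent do its own validation first
        validated = ParentSource.validate_query(query, request, user)

        # Queue a follow-up processor, passing along a shallow copy of the raw query
        validated["next"] = [{"type": "audio-to-text", "parameters": query.copy()}]
        return validated


raw_query = {"media_type": "audio", "model": "small"}
safe_query = ChainedSource.validate_query(raw_query, request=None, user=None)
```

Note that `query.copy()` is a shallow copy: the follow-up processor receives its own dictionary, but any nested values would still be shared with the original query.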
Inherited Members
- backend.lib.worker.BasicWorker
  - BasicWorker
  - INTERRUPT_NONE
  - INTERRUPT_RETRY
  - INTERRUPT_CANCEL
  - queue
  - log
  - manager
  - interrupted
  - modules
  - init_time
  - name
  - run
  - clean_up
  - request_interrupt
  - is_4cat_class
- datasources.media_import.import_media.SearchMedia
  - extension
  - is_local
  - is_static
  - max_workers
  - disallowed_characters
  - accepted_file_types
  - after_create
  - process
  - get_safe_filename
- backend.lib.processor.BasicProcessor
  - db
  - job
  - dataset
  - owner
  - source_dataset
  - source_file
  - config
  - is_running_in_preset
  - filepath
  - work
  - after_process
  - remove_files
  - abort
  - add_field_to_parent
  - iterate_archive_contents
  - unpack_archive_contents
  - extract_archived_file_by_name
  - write_csv_items_and_finish
  - write_archive_and_finish
  - create_standalone
  - map_item_method_available
  - get_mapped_item
  - is_filter
  - get_status
  - is_top_dataset
  - is_from_collector
  - get_extension
  - is_rankable
  - exclude_followup_processors
  - is_4cat_processor