lyrics-transcriber 0.20.0__py3-none-any.whl → 0.30.1__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- lyrics_transcriber/__init__.py +2 -5
- lyrics_transcriber/cli/cli_main.py +206 -0
- lyrics_transcriber/core/__init__.py +0 -0
- lyrics_transcriber/core/controller.py +317 -0
- lyrics_transcriber/correction/base_strategy.py +29 -0
- lyrics_transcriber/correction/corrector.py +52 -0
- lyrics_transcriber/correction/strategy_diff.py +263 -0
- lyrics_transcriber/lyrics/base_lyrics_provider.py +201 -0
- lyrics_transcriber/lyrics/genius.py +70 -0
- lyrics_transcriber/lyrics/spotify.py +82 -0
- lyrics_transcriber/output/__init__.py +0 -0
- lyrics_transcriber/output/generator.py +271 -0
- lyrics_transcriber/{utils → output}/subtitles.py +12 -12
- lyrics_transcriber/storage/__init__.py +0 -0
- lyrics_transcriber/storage/dropbox.py +225 -0
- lyrics_transcriber/transcribers/audioshake.py +216 -0
- lyrics_transcriber/transcribers/base_transcriber.py +186 -0
- lyrics_transcriber/transcribers/whisper.py +321 -0
- {lyrics_transcriber-0.20.0.dist-info → lyrics_transcriber-0.30.1.dist-info}/METADATA +5 -16
- lyrics_transcriber-0.30.1.dist-info/RECORD +25 -0
- lyrics_transcriber-0.30.1.dist-info/entry_points.txt +3 -0
- lyrics_transcriber/audioshake_transcriber.py +0 -122
- lyrics_transcriber/corrector.py +0 -57
- lyrics_transcriber/llm_prompts/README.md +0 -10
- lyrics_transcriber/llm_prompts/llm_prompt_lyrics_correction_andrew_handwritten_20231118.txt +0 -55
- lyrics_transcriber/llm_prompts/llm_prompt_lyrics_correction_gpt_optimised_20231119.txt +0 -36
- lyrics_transcriber/llm_prompts/llm_prompt_lyrics_matching_andrew_handwritten_20231118.txt +0 -19
- lyrics_transcriber/llm_prompts/promptfooconfig.yaml +0 -61
- lyrics_transcriber/llm_prompts/test_data/ABBA-UnderAttack-Genius.txt +0 -48
- lyrics_transcriber/transcriber.py +0 -934
- lyrics_transcriber/utils/cli.py +0 -179
- lyrics_transcriber-0.20.0.dist-info/RECORD +0 -19
- lyrics_transcriber-0.20.0.dist-info/entry_points.txt +0 -3
- lyrics_transcriber/{utils → cli}/__init__.py +0 -0
- lyrics_transcriber/{utils → output}/ass.py +0 -0
- {lyrics_transcriber-0.20.0.dist-info → lyrics_transcriber-0.30.1.dist-info}/LICENSE +0 -0
- {lyrics_transcriber-0.20.0.dist-info → lyrics_transcriber-0.30.1.dist-info}/WHEEL +0 -0
lyrics_transcriber/llm_prompts/llm_prompt_lyrics_correction_gpt_optimised_20231119.txt +0 -36
@@ -1,36 +0,0 @@
-You are a song lyric corrector for a karaoke video studio, specializing in correcting lyrics for synchronization with music videos. Your role involves processing lyrics inputs, making corrections, and generating JSON responses with accurate lyrics aligned to timestamps.
-
-Task:
-- Receive lyrics data inputs of varying quality.
-- Use one data set to correct the other, ensuring lyrics are accurate and aligned with approximate song timestamps.
-- Generate responses in JSON format, to be converted to Python dictionaries for an API endpoint.
-
-Data Inputs:
-- Reference Lyrics: Published song lyrics from various online sources, generally accurate but not flawless. Be aware of potentially missing or incorrect sections (e.g., choruses, outros).
-- Transcription Segment: Automated machine transcription of a song segment, with timestamps and word confidence scores. Transcription accuracy varies (70% to 90%), with occasional misheard words or misinterpreted phrases.
-
-Additional Context:
-- When available, you'll receive the previous 2 corrected lines and the next 1 uncorrected segment for context.
-
-Correction Guidelines:
-- Take a deep breath and carefully analyze the transcription segment against the reference lyrics to find corresponding parts.
-- Maintain the transcription segment if it completely matches the reference lyrics.
-- Correct misheard or similar-sounding words.
-- Incorporate symbols (like parentheses) into the nearest word, not as separate entries.
-- Removing a word or two for accuracy is permissible.
-
-Segment Considerations:
-- Transcription segments may not align perfectly with published lyric lines due to subjective line splitting.
-- Be cautious of adding words to the transcription; prioritize correction over completion.
-- Avoid duplicating words already present in the "Next (un-corrected) transcript segment".
-
-JSON Response Structure:
-- id: Segment ID from input data.
-- text: Corrected lyrics for the segment.
-- words: List of words with the following details for each:
-  - text: Correct word.
-  - start: Estimated start timestamp.
-  - end: Estimated end timestamp.
-  - confidence: Confidence score (0-1) on word accuracy. Retain existing score if unchanged.
-
-Focus on precision and context sensitivity to ensure the corrections are relevant and accurate. Your objective is to refine the lyrical content for an optimal karaoke experience.
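The "JSON Response Structure" described in the removed prompt can be illustrated with a small parser. This is only a sketch of how a caller might turn such a response into typed Python objects; the names `Word`, `CorrectedSegment`, and `parse_corrected_segment` are hypothetical and do not come from the package.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float
    end: float
    confidence: float


@dataclass
class CorrectedSegment:
    id: int
    text: str
    words: List[Word]


def parse_corrected_segment(raw: str) -> CorrectedSegment:
    """Parse a JSON string with the id/text/words fields named in the prompt."""
    data = json.loads(raw)
    words = [
        Word(
            text=w["text"],
            start=float(w["start"]),
            end=float(w["end"]),
            confidence=float(w["confidence"]),
        )
        for w in data["words"]
    ]
    # The prompt states that confidence is a score between 0 and 1.
    for w in words:
        if not 0.0 <= w.confidence <= 1.0:
            raise ValueError(f"confidence out of range for word {w.text!r}")
    return CorrectedSegment(id=data["id"], text=data["text"], words=words)
```

A missing key raises `KeyError` and malformed JSON raises `json.JSONDecodeError`, so a caller can decide whether to retry or fall back to the uncorrected segment.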
lyrics_transcriber/llm_prompts/llm_prompt_lyrics_matching_andrew_handwritten_20231118.txt +0 -19
@@ -1,19 +0,0 @@
-You are a song lyric matcher for a karaoke video studio, responsible for reading lyrics inputs and identifying if they match, according to predefined criteria.
-
-Your task is to take two lyrics data inputs, and determine if they are from the same song or not.
-Your response must be either "Yes" or "No", with no other text, as your response will be processed by some Python code.
-
-Data input 1 will be lyrics generated from a song using automated machine transcription.
-Generally the transcription is at least 50% accurate, but some of the words heard by the transcription will likely be homonyms or mistakes.
-
-Data input 2 will be published lyrics for a song, fetched from an online source.
-If they are for the same song, these should be at least 90% accurate, with generally correct words and phrases.
-Even when they are for the same song, they may not be perfect. Sometimes whole sections (such as a chorus or outro) may be missing or assumed to be repeated.
-
-There is a chance the lyrics in data input 2 may be for a totally different song, as the automated process fetching lyrics from online sources sometimes gets an erroneous match.
-In this scenario, there may be one or two words which still match up by coincidence but generally you would expect less than 10% of the lyrics to match up.
-This "totally different song" scenario is what you need to detect, and return "No".
-
-Carefully analyse the two lyrics inputs provided, and make a reasonable guess as to whether they are for the same song or not.
-If the lyrics look like they are from the same song (but perhaps with some minor differences), you should return "Yes".
-If the lyrics look totally different, or you are not sure if the lyrics are both from the same song, you should return "No"
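The removed matching prompt requires a bare "Yes" or "No" because the reply is processed by Python code. A minimal sketch of such processing follows. The function name `parse_match_response` is hypothetical, and tolerating surrounding quotes or a trailing period is an assumption made here, not behaviour confirmed by the package.

```python
def parse_match_response(response: str) -> bool:
    """Map a "Yes"/"No" model reply to a boolean.

    Surrounding whitespace, quotes, and a trailing period are tolerated;
    anything else is rejected so that unexpected replies are not silently
    treated as a match.
    """
    answer = response.strip().strip("\"'").rstrip(".").lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    raise ValueError(f"unexpected matcher response: {response!r}")
```

Raising on unexpected text is one design choice. Since the prompt tells the model to answer "No" when unsure, a caller could instead catch the error and treat it as no match.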
lyrics_transcriber/llm_prompts/promptfooconfig.yaml +0 -61
@@ -1,61 +0,0 @@
-# This configuration runs each prompt through a series of example inputs and checks if they meet requirements.
-# Learn more: https://promptfoo.dev/docs/configuration/guide
-
-description: Song lyric corrector for a karaoke video studio, responsible for reading lyrics inputs, correcting them and generating JSON-based responses containing the corrected lyrics according to predefined criteria.
-providers:
-  - id: openai:gpt-3.5-turbo-1106
-    config:
-      temperature: 0
-  # - id: openai:gpt-4-1106-preview
-  #   config:
-  #     temperature: 0
-prompts:
-  - file://llm_prompt_lyrics_correction_andrew_handwritten_20231118.txt
-
-defaultTest:
-  assert:
-    - type: is-json
-      value:
-        required: [id, text, words]
-        type: object
-        properties:
-          id:
-            type: number
-          text:
-            type: string
-          words:
-            type: array
-            items:
-              type: object
-              properties:
-                text:
-                  type: string
-                start:
-                  type: number
-                end:
-                  type: number
-                confidence:
-                  type: number
-
-tests:
-  - description: ABBA - Under Attack (segment 0)
-    vars:
-      reference_lyrics: file://test_data/ABBA-UnderAttack-Genius.txt
-      previous_two_corrected_lines:
-      upcoming_two_uncorrected_lines:
-      segment_input: |
-        {"id": 0, "start": 17.46, "end": 21.3, "confidence": 0.792, "text": " Don't know how to take it, don't know where to go", "words": [{"text": "Don't", "start": 17.46, "end": 18.2, "confidence": 0.278}, {"text": "know", "start": 18.2, "end": 18.42, "confidence": 0.965}, {"text": "how", "start": 18.42, "end": 18.66, "confidence": 0.865}, {"text": "to", "start": 18.66, "end": 18.88, "confidence": 0.994}, {"text": "take", "start": 18.88, "end": 19.2, "confidence": 0.992}, {"text": "it,", "start": 19.2, "end": 19.44, "confidence": 0.974}, {"text": "don't", "start": 19.56, "end": 19.8, "confidence": 0.917}, {"text": "know", "start": 19.8, "end": 20.02, "confidence": 0.989}, {"text": "where", "start": 20.02, "end": 20.46, "confidence": 0.963}, {"text": "to", "start": 20.46, "end": 20.76, "confidence": 0.983}, {"text": "go", "start": 20.76, "end": 21.3, "confidence": 0.982}]}
-    assert:
-      - type: contains
-        value: "Don't know how to take it, don't know where to go"
-
-  - description: ABBA - Under Attack (segment 1)
-    vars:
-      reference_lyrics: file://test_data/ABBA-UnderAttack-Genius.txt
-      previous_two_corrected_lines:
-      upcoming_two_uncorrected_lines:
-      segment_input: |
-        {"id": 1, "start": 22.04, "end": 27.84, "confidence": 0.763, "text": " My resistance running low And every day the hole is getting tighter", "words": [{"text": "My", "start": 22.04, "end": 22.32, "confidence": 0.535}, {"text": "resistance", "start": 22.32, "end": 22.94, "confidence": 0.936}, {"text": "running", "start": 22.94, "end": 23.66, "confidence": 0.89}, {"text": "low", "start": 23.66, "end": 24.36, "confidence": 0.999}, {"text": "And", "start": 24.36, "end": 25.14, "confidence": 0.485}, {"text": "every", "start": 25.14, "end": 25.56, "confidence": 0.568}, {"text": "day", "start": 25.56, "end": 25.88, "confidence": 0.997}, {"text": "the", "start": 25.88, "end": 26.1, "confidence": 0.959}, {"text": "hole", "start": 26.1, "end": 26.48, "confidence": 0.361}, {"text": "is", "start": 26.48, "end": 26.68, "confidence": 0.947}, {"text": "getting", "start": 26.68, "end": 27.08, "confidence": 0.996}, {"text": "tighter", "start": 27.08, "end": 27.84, "confidence": 0.975}]}
-    assert:
-      - type: contains
-        value: "My resistance running low And every day the hold is getting tighter"
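The removed config pairs a default `is-json` assertion (required keys plus field types) with a per-test `contains` assertion. The sketch below approximates both checks in plain Python to show what a passing model output has to satisfy. It is not promptfoo's implementation; `check_output` and its failure messages are hypothetical, and only the standard library is used rather than a JSON Schema validator.

```python
import json

# Top-level keys listed under `required` in the removed config.
REQUIRED_KEYS = ("id", "text", "words")

# Per-word field types, mirroring the `properties` block of the schema.
WORD_FIELDS = {
    "text": str,
    "start": (int, float),
    "end": (int, float),
    "confidence": (int, float),
}


def check_output(output: str, expected_text: str) -> list:
    """Return a list of failure messages; an empty list means both checks pass."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    failures = []
    for key in REQUIRED_KEYS:
        if key not in data:
            failures.append(f"missing required key: {key}")

    words = data.get("words", [])
    if not isinstance(words, list):
        failures.append("words is not an array")
        words = []
    for i, word in enumerate(words):
        if not isinstance(word, dict):
            failures.append(f"words[{i}] is not an object")
            continue
        for field, types in WORD_FIELDS.items():
            if field in word and not isinstance(word[field], types):
                failures.append(f"words[{i}].{field} has the wrong type")

    # Approximates `type: contains`: a substring check on the raw output.
    if expected_text not in output:
        failures.append("output does not contain expected text")
    return failures
```

In the second test case above, the input segment contains "hole" while the expected value contains "hold", so an output that merely echoes the transcription fails the substring check.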
lyrics_transcriber/llm_prompts/test_data/ABBA-UnderAttack-Genius.txt +0 -48
@@ -1,48 +0,0 @@
|
|
1
|
-
Don't know how to take it, don't know where to go
|
2
|
-
My resistance running low
|
3
|
-
And every day the hold is getting tighter and it troubles me so
|
4
|
-
(You know that I'm nobody's fool)
|
5
|
-
I'm nobody's fool and yet it's clear to me
|
6
|
-
I don't have a strategy
|
7
|
-
It's just like taking candy from a baby and I think I must be
|
8
|
-
|
9
|
-
Under attack, I'm being taken
|
10
|
-
About to crack, defences breaking
|
11
|
-
Won't somebody please have a heart
|
12
|
-
Come and rescue me now 'cause I'm falling apart
|
13
|
-
Under attack, I'm taking cover
|
14
|
-
He's on my track, my chasing lover
|
15
|
-
Thinking nothing can stop him now
|
16
|
-
Should I want to, I'm not sure I would know how
|
17
|
-
|
18
|
-
This is getting crazy, I should tell him so
|
19
|
-
Really let my anger show
|
20
|
-
Persuade him that the answer to his questions is a definite no
|
21
|
-
(I'm kind of flattered I suppose)
|
22
|
-
Guess I'm kind of flattered but I'm scared as well
|
23
|
-
Something like a magic spell
|
24
|
-
I hardly dare to think of what would happen, where I'd be if I fell
|
25
|
-
|
26
|
-
Under attack, I'm being taken
|
27
|
-
About to crack, defences breaking
|
28
|
-
Won't somebody please have a heart
|
29
|
-
Come and rescue me now 'cause I'm falling apart
|
30
|
-
Under attack, I'm taking cover
|
31
|
-
He's on my track, my chasing lover
|
32
|
-
Thinking nothing's gonna stop him now
|
33
|
-
Should I want to, I'm not sure I won't know how
|
34
|
-
|
35
|
-
Under attack, I'm being taken
|
36
|
-
About to crack, defences breaking
|
37
|
-
Won't somebody see and save a heart
|
38
|
-
Come and rescue me now 'cause I'm falling apart
|
39
|
-
Under attack, I'm taking cover
|
40
|
-
He's on my track, my chasing lover
|
41
|
-
Thinking nothing can stop him now
|
42
|
-
Should I want to, I'm not sure I would know how
|
43
|
-
|
44
|
-
Under attack, I'm being taken
|
45
|
-
About to crack, defences breaking
|
46
|
-
Won't somebody please have a heart
|
47
|
-
Come and rescue me now 'cause I'm falling apart
|
48
|
-
Under attack, I'm taking cover
|