#stt

🅹🅴🅳🅸🅴 🇺🇦🕊️<p><a href="https://chaos.social/tags/opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensource</span></a> <a href="https://chaos.social/tags/whisper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>whisper</span></a> as an app on your phone is actually pretty fun:<br><a href="https://f-droid.org/packages/org.woheller69.whisper/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">f-droid.org/packages/org.wohel</span><span class="invisible">ler69.whisper/</span></a></p><p><a href="https://chaos.social/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a></p>
cs<p>I wrote a <a href="https://mastodon.sdf.org/tags/python" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>python</span></a> script that consumes an RSS feed, downloads an episode of a podcast and creates a transcript. It works pretty well, but it relies upon Whisper. I tried to find other <a href="https://mastodon.sdf.org/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> libraries that will work, but so far everything I have tried has backfired or I just can’t get it to work. Suggestions? I thought I was going to get Coqui STT to do it, but I failed. 😞</p>
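The post above does not include the script itself, so the following is only a minimal sketch of the workflow it describes (read an RSS feed, fetch an episode, hand the audio to a speech-to-text engine). The function names `latest_enclosure_url`, `download` and `transcribe_latest` are hypothetical, and the sketch assumes the feed lists the newest episode first and exposes the audio through a standard `<enclosure>` element. The STT engine is passed in as a plain callable so that Whisper or any alternative can be swapped in.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional
from urllib.request import urlopen


def latest_enclosure_url(rss_xml: str) -> Optional[str]:
    """Return the audio URL of the first <item> that carries an <enclosure>.

    Assumption: the feed lists the newest episode first.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            return enclosure.get("url")
    return None


def download(url: str, dest: str) -> str:
    """Save the episode to disk with a plain HTTP GET (no retries)."""
    with urlopen(url) as response, open(dest, "wb") as out:
        out.write(response.read())
    return dest


def transcribe_latest(
    rss_xml: str,
    fetch: Callable[[str, str], str],
    transcribe: Callable[[str], str],
    dest: str = "episode.mp3",
) -> str:
    """Fetch the newest episode and return whatever text the engine produces."""
    url = latest_enclosure_url(rss_xml)
    if url is None:
        raise ValueError("feed has no item with an enclosure")
    return transcribe(fetch(url, dest))


# With the openai-whisper package installed, the engine could be wired up as:
#   import whisper
#   model = whisper.load_model("base")
#   text = transcribe_latest(feed_xml, download,
#                            lambda path: model.transcribe(path)["text"])
```

Keeping the engine behind a one-argument callable means that trying a different STT library only requires replacing the lambda, not the feed handling.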
njoseph :fbx:<p>Does anybody know of a better <a href="https://social.masto.host/tags/speechToText" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>speechToText</span></a> alternative to this? </p><p>This feels like a terrible hack that keeps breaking. I decided to look for alternatives after I saw them using /dev/shm to store ML models.</p><p>QuantiusBenignus/BlahST<br><a href="https://github.com/QuantiusBenignus/BlahST" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/QuantiusBenignus/Bl</span><span class="invisible">ahST</span></a></p><p>SpeechNote (aka dsnote) does not qualify since it doesn't integrate with the clipboard.</p><p><a href="https://social.masto.host/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> <a href="https://social.masto.host/tags/WhisperCPP" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WhisperCPP</span></a></p>
☕ 🏳️‍🌈 schlackenfuchs<p><span class="h-card" translate="no"><a href="https://grapheneos.social/@GrapheneOS" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>GrapheneOS</span></a></span> I tried <a href="https://social.tchncs.de/tags/Transcribro" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Transcribro</span></a>, but it seems it understands only English... So: not really a speech-to-text (<a href="https://social.tchncs.de/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a>) <a href="https://social.tchncs.de/tags/keyboard" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>keyboard</span></a> for everyone</p>
Thomas D.<p>There is a new open source STT (speech-to-text) model called "moonshine" that promises superior performance and lower size on embedded devices. I will check it out :) <a href="https://github.com/usefulsensors/moonshine" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/usefulsensors/moons</span><span class="invisible">hine</span></a></p><p>Seems promising.</p><p><a href="https://mastodontech.de/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a> <a href="https://mastodontech.de/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://mastodontech.de/tags/python" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>python</span></a></p>
mmcm<p>I just uninstalled 4 <a href="https://mastodon.social/tags/flatpak" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>flatpak</span></a> apps:</p><p>* <a href="https://mastodon.social/tags/speechNote" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>speechNote</span></a> (+AMD addon)<br>* <a href="https://mastodon.social/tags/mongodb" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>mongodb</span></a> compass<br>* <span class="h-card" translate="no"><a href="https://fosstodon.org/@organicmaps" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>organicmaps</span></a></span> <br>* <a href="https://mastodon.social/tags/verso" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>verso</span></a> (which I installed for fun)</p><p>This liberated a whopping 52GiB off my system drive. Especially the AMD "addon" with over 12GiB was shocking.</p><p>So, guess I'm in the market for a <a href="https://mastodon.social/tags/linux" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>linux</span></a> <a href="https://mastodon.social/tags/floss" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>floss</span></a> offline-only <a href="https://mastodon.social/tags/whisper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>whisper</span></a> / <a href="https://mastodon.social/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> solution that integrates into a desktop.<br>And for <a href="https://mastodon.social/tags/organicMaps" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>organicMaps</span></a> I guess I'll wait until one day there'll be a .deb. 🙄</p>
njoseph :fbx:<p>Expanded my recent blog post about small AI tools in Debian. Added a section about Firefox extensions for TTS and a section about Optical Character Recognition.</p><p><a href="https://njoseph.me/blog/posts/small-ai-tools-debian/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">njoseph.me/blog/posts/small-ai</span><span class="invisible">-tools-debian/</span></a></p><p><a href="https://social.masto.host/tags/OCR" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OCR</span></a> <a href="https://social.masto.host/tags/Debian" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Debian</span></a> <a href="https://social.masto.host/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> <a href="https://social.masto.host/tags/TTS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TTS</span></a> <a href="https://social.masto.host/tags/ePub" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ePub</span></a></p>
OSTechNix<p>Speech Note – Offline Speech Recognition, Text-to-Speech and Translation App for Linux <a href="https://floss.social/tags/Speechnote" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Speechnote</span></a> <a href="https://floss.social/tags/TextToSpeech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TextToSpeech</span></a> <a href="https://floss.social/tags/SpeechToText" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SpeechToText</span></a> <a href="https://floss.social/tags/Translator" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Translator</span></a> <a href="https://floss.social/tags/TTS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TTS</span></a> <a href="https://floss.social/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> <a href="https://floss.social/tags/Opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Opensource</span></a> <a href="https://floss.social/tags/Linux" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Linux</span></a> <br><a href="https://ostechnix.com/speech-note-speech-recognition-text-to-speech-translation-app-for-linux/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">ostechnix.com/speech-note-spee</span><span class="invisible">ch-recognition-text-to-speech-translation-app-for-linux/</span></a></p>
njoseph :fbx:<p>New blog post!</p><p>Using small AI tools on Debian GNU/Linux<br><a href="https://njoseph.me/blog/posts/small-ai-tools-debian/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">njoseph.me/blog/posts/small-ai</span><span class="invisible">-tools-debian/</span></a></p><p>I am using the word <a href="https://social.masto.host/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> as it is meant to be used, as click-bait. 😉</p><p><a href="https://social.masto.host/tags/accessibility" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>accessibility</span></a> <a href="https://social.masto.host/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://social.masto.host/tags/tts" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tts</span></a></p>
Tykayn<p>Speech to Text — <a href="https://mastodon.cipherbliss.com/tags/Kdenlive" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Kdenlive</span></a> Manual 24.05 documentation<br><a href="https://mastodon.cipherbliss.com/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://mastodon.cipherbliss.com/tags/vosk" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>vosk</span></a> <a href="https://mastodon.cipherbliss.com/tags/transcription" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>transcription</span></a> </p><p><a href="https://docs.kdenlive.org/en/effects_and_compositions/speech_to_text.html" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="ellipsis">docs.kdenlive.org/en/effects_a</span><span class="invisible">nd_compositions/speech_to_text.html</span></a></p>
Kathy Reid<p>Why does <a href="https://aus.social/tags/Microsoft" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Microsoft</span></a> want to implement <a href="https://aus.social/tags/Recall" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Recall</span></a>? It's not about *images*. It's about modelling what workers do on Windows, and then replacing them. </p><p>The most expensive part of a computer is the fallible feelings-filled unpredictable meat sack that operates it. </p><p>Google has YouTube, Google Photos, Maps, and a bucket load of search data, Google Analytics, advertising, as well as its <a href="https://aus.social/tags/GCP" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GCP</span></a> data (e.g. <a href="https://aus.social/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> transcriptions). And a bunch of data from Android services. From this data they can model speech, model videos and model advertising systems, and how humans respond to them. </p><p>But they can't model what people do on computers. </p><p>Amazon has Prime data, and a bucket load of compute. But no operating system data. They can build models based around e-commerce and advertising systems. </p><p>But they can't model what people do on computers. </p><p>Meta has *waves hands* enough analytics to model human behaviour in the Metaverse. </p><p>But they can't model what people do on computers. </p><p>Microsoft has GitHub. <br>Microsoft has LinkedIn. <br>Microsoft has SharePoint.<br>Microsoft has Teams. <br>Microsoft has Dynamics. <br>Microsoft has O365. <br>Microsoft has Windows telemetry data. </p><p>Microsoft can model what people do on (Windows) computers. Like fill out spreadsheets. Write emails. Synthesize web pages of research. Interact with colleagues on Teams. Create and edit documents. 
</p><p>Microsoft wants <a href="https://aus.social/tags/MicrosoftRecall" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MicrosoftRecall</span></a> data so they can model what people *do* with operating systems. </p><p>Then replace them. </p><p>Imagine a CoPilot that doesn't just write buggy code. Imagine one that also does spreadsheets. That creates documents on SharePoint. That communicates with colleagues on Teams. That has a customer pipeline on Dynamics.</p><p>That's what Recall is about - 360 degree surveillance of the worker, to model their functions, make them fungible, replicable - and replaceable.</p>
Tykayn<p>Is there no voice input on mobile outside of GAFAM?<br>Everything I find on <a href="https://mastodon.cipherbliss.com/tags/fdroid" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fdroid</span></a> is text to speech, when what I'm looking for is speech to text. Damn.</p><p><a href="https://mastodon.cipherbliss.com/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://mastodon.cipherbliss.com/tags/tts" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tts</span></a></p>
Scimmia di Mare<p><a href="https://mastodon.uno/tags/UnoAiuto" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>UnoAiuto</span></a> <a href="https://mastodon.uno/tags/MastoAiuto" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MastoAiuto</span></a></p><p>For <a href="https://mastodon.uno/tags/iOS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>iOS</span></a> or <a href="https://mastodon.uno/tags/OSX" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OSX</span></a> <a href="https://mastodon.uno/tags/Ventura" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Ventura</span></a>, can anyone suggest an <a href="https://mastodon.uno/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> program (voice transcription, speech to text)?</p>
DebugPoint - Linux &Dev Portal<p>Introducing "Speech Note": Offline STT, TTS, Translator for your Linux desktop<br><a href="https://www.debugpoint.com/speech-note-text-to-speech/" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">debugpoint.com/speech-note-tex</span><span class="invisible">t-to-speech/</span></a></p><p><a href="https://floss.social/tags/linux" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>linux</span></a> <a href="https://floss.social/tags/opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensource</span></a> <a href="https://floss.social/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://floss.social/tags/tts" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tts</span></a></p>
mkiol<p>If you have to do Speech-to-Text and Text-to-Speech tasks and don't want to send your data to the Internet, I recommend trying Speech Note (Linux desktop app). </p><p>It is easy to use, works offline and supports 57 languages!</p><p>Speech Note works thanks to powerful <a href="https://mastodon.social/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> and <a href="https://mastodon.social/tags/TTS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TTS</span></a> engines underneath: <a href="https://mastodon.social/tags/DeepSpeech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DeepSpeech</span></a> <a href="https://mastodon.social/tags/Coqui" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Coqui</span></a> <a href="https://mastodon.social/tags/Vosk" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Vosk</span></a> <a href="https://mastodon.social/tags/Whisper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Whisper</span></a> <a href="https://mastodon.social/tags/Piper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Piper</span></a> <a href="https://mastodon.social/tags/eSpeak" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>eSpeak</span></a> <a href="https://mastodon.social/tags/MBROLA" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MBROLA</span></a> <a href="https://mastodon.social/tags/RHVoice" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RHVoice</span></a></p><p>You can download <a href="https://mastodon.social/tags/SpeechNote" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SpeechNote</span></a> from <a href="https://mastodon.social/tags/Flathub" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Flathub</span></a>: <a href="https://flathub.org/apps/net.mkiol.SpeechNote" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="ellipsis">flathub.org/apps/net.mkiol.Spe</span><span class="invisible">echNote</span></a></p><p>Video demo: <a href="https://youtu.be/EhUPvaHvssw" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">youtu.be/EhUPvaHvssw</span><span class="invisible"></span></a></p>
Tykayn<p>Do you know any good text to speech engines that produce clean voices without a fucking English accent? (text turned into voice, not the other way around) <br>Paid or not, as long as there are no GAFAM services involved and the voice is really clean with no robot-voice effect, I'm interested.<br><a href="https://mastodon.cipherbliss.com/tags/a11y" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>a11y</span></a> <a href="https://mastodon.cipherbliss.com/tags/accessibilit%C3%A9" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>accessibilité</span></a> <a href="https://mastodon.cipherbliss.com/tags/vocalisation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>vocalisation</span></a> <a href="https://mastodon.cipherbliss.com/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> <a href="https://mastodon.cipherbliss.com/tags/speech2text" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>speech2text</span></a></p>
Garrow Bregenza :lattentacle:<p>This page looks promising and has some clues: <a href="https://docs.getleon.ai/offline" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">docs.getleon.ai/offline</span><span class="invisible"></span></a> <a href="https://mastodon.social/tags/coqui" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>coqui</span></a> <a href="https://mastodon.social/tags/stt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stt</span></a> @ <a href="https://github.com/coqui-ai/STT" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">github.com/coqui-ai/STT</span><span class="invisible"></span></a> and <a href="https://mastodon.social/tags/cmu" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>cmu</span></a> <a href="https://mastodon.social/tags/flite" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>flite</span></a> @ <a href="http://www.festvox.org/flite/" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">http://www.</span><span class="">festvox.org/flite/</span><span class="invisible"></span></a></p>
Marcel Waldvogel<p><span class="h-card"><a href="https://mastodon.social/@MrClicko" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>MrClicko</span></a></span> <a href="https://waldvogel.family/tags/Whisper" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Whisper</span></a> was trained on almost 700,000 hours of speech (<a href="https://waldvogel.family/tags/Sprache" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Sprache</span></a>).</p><p>That's over 77 years of chatter!</p><p>⅔ of these 681,070 hours were English (<a href="https://waldvogel.family/tags/Englisch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Englisch</span></a>); German (<a href="https://waldvogel.family/tags/Deutsch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Deutsch</span></a>) accounted for only 2% (transcribed, <a href="https://waldvogel.family/tags/transkribiert" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>transkribiert</span></a>, into (Standard) German) + ½% (translated, <a href="https://waldvogel.family/tags/%C3%BCbersetzt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>übersetzt</span></a>, into English). Whether any of it was Swiss German is not recorded. It is remarkable that, despite so little input, the model performs very well in German and still does OK in Swiss German. <a href="https://waldvogel.family/tags/STT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>STT</span></a> <a href="https://waldvogel.family/tags/SpeechToText" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SpeechToText</span></a><br><a href="https://arxiv.org/abs/2212.04356" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/2212.04356</span><span class="invisible"></span></a></p>
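The arithmetic in the post above can be checked with a few lines. This is only a rough sketch: the 681,070-hour total and the 2% and ½% shares are taken from the post as stated (the percentages are rounded there, so the hour figures derived from them are approximate), and the variable names are made up for illustration.

```python
TOTAL_HOURS = 681_070            # total training audio quoted in the post
HOURS_PER_YEAR = 24 * 365.25     # continuous playback, averaged over leap years

# How many years of non-stop audio the training set corresponds to
years = TOTAL_HOURS / HOURS_PER_YEAR

# Approximate German shares, using the rounded percentages from the post
german_transcribed_hours = TOTAL_HOURS * 0.02
german_translated_hours = TOTAL_HOURS * 0.005

print(f"{years:.1f} years of continuous audio")
print(f"about {german_transcribed_hours:,.0f} h German transcribed")
print(f"about {german_translated_hours:,.0f} h German translated into English")
```

The result lands between 77 and 78 years, consistent with the "over 77 years" quoted in the post.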