Command line tool to generate audio using SAPI.

SAPI command line interface

A simple tool to generate audio from text.

Using getoptW.

Development process

Microsoft can suck a big fat poopy pee pee.


  • Install Visual Studio Build Tools (direct download) and select "Desktop Development with C++". In the panel on the right, make sure "C++ ATL for latest ..." is checked. To save space, you can uncheck "C++ CMake tools ...", "Testing tools ..." and "C++ AddressSanitizer".

Build Tools

After installing, run "Developer Command Prompt for VS 2022" from the Start menu, or click "Launch" in Visual Studio Installer. Change to the folder where you cloned this repo, then into sapicli, and run:

msbuild sapicli.vcxproj -p:Configuration=Release

EVNT chunk

.wav files generated by using SPBindToFile() contain an EVNT chunk, which is a list of serialized events, their structure being that of SPSERIALIZEDEVENT plus any string referenced inside the event itself.
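
To inspect the chunk outside of a RIFF viewer, the file's top-level chunks can be scanned for the `EVNT` identifier. The following is a minimal sketch, assuming the standard RIFF/WAVE layout (a 12-byte header, then chunks made of a 4-byte ID, a little-endian 32-bit size, and a payload padded to an even length) and assuming the chunk ID is spelled `EVNT` as above. `ChunkSpan` and `find_chunk` are hypothetical names for illustration, not part of SAPI or sapicli.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Location of a chunk's payload inside the file buffer.
struct ChunkSpan {
    bool found;          // false if the chunk is missing or the file is malformed
    std::size_t offset;  // offset of the first payload byte
    std::uint32_t size;  // payload size in bytes, as stored in the chunk header
};

// Read a little-endian 32-bit value without relying on host byte order.
static std::uint32_t read_u32_le(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Hypothetical helper: scan the top-level chunks of a RIFF/WAVE buffer
// for the four-character ID `id` (for example "EVNT").
ChunkSpan find_chunk(const std::vector<std::uint8_t>& file, const char* id) {
    const ChunkSpan missing = {false, 0, 0};

    // 12-byte header: "RIFF", total size, "WAVE".
    if (file.size() < 12 || std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0) {
        return missing;
    }

    std::size_t pos = 12;
    while (pos + 8 <= file.size()) {
        const std::uint32_t size = read_u32_le(file.data() + pos + 4);
        if (size > file.size() - (pos + 8)) {
            break;  // chunk claims more data than the file holds
        }
        if (std::memcmp(file.data() + pos, id, 4) == 0) {
            const ChunkSpan hit = {true, pos + 8, size};
            return hit;
        }
        // Skip header and payload; odd-sized payloads carry one pad byte.
        pos += 8 + std::size_t(size) + (size & 1u);
    }
    return missing;
}
```

Calling `find_chunk(bytes, "EVNT")` on the contents of a generated .wav file returns the payload offset and size if the chunk is present.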

The first byte is the event type, and most events are 24 bytes long. Strings that follow events are stored as wide characters, and their length is padded up to a multiple of 4 bytes. For example, a 126-byte string is stored as 128 bytes, with the last two bytes being zeroes.
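
The padding arithmetic can be sketched as follows. This assumes 2-byte wide characters (as on Windows) and a 2-byte terminator that is counted before rounding, which matches the sphelper.h excerpt further down. The helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Round a byte count up to the next multiple of 4 (DWORD alignment).
constexpr std::size_t round_up_to_dword(std::size_t bytes) {
    return (bytes + 3) / 4 * 4;
}

// Bytes occupied by a string that follows an event: the UTF-16 code units
// plus a 2-byte terminator, padded to a DWORD boundary.
// Assumption: the string is stored as 2-byte wide characters.
std::size_t stored_string_bytes(const std::u16string& text) {
    return round_up_to_dword((text.size() + 1) * sizeof(char16_t));
}
```

Under these assumptions a 62-character string takes 124 bytes of text plus 2 bytes of terminator, 126 bytes in total, and is stored as 128. A 63-character string also takes 128 bytes, because its terminator already ends on a 4-byte boundary.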

EVNT Chunk in RIFFPad

Excerpt from sphelper.h:

/*
* SpSerializedEventSize *
*   Description:
*       Returns the size, in bytes, used by a serialized event.  The caller can
*   pass a pointer to either a SPSERIALIZEDEVENT or SPSERIALIZEDEVENT64 structure.
*   Returns:
*       Number of bytes used by serialized event
*/

template <class T>
inline ULONG SpSerializedEventSize(const T * pSerEvent)
{
    ULONG ulSize = sizeof(T);

    if( ( pSerEvent->elParamType == SPET_LPARAM_IS_POINTER ) && pSerEvent->SerializedlParam )
    {
        ulSize += ULONG(pSerEvent->SerializedwParam);
    }
    else if ((pSerEvent->elParamType == SPET_LPARAM_IS_STRING || pSerEvent->elParamType == SPET_LPARAM_IS_TOKEN) &&
             pSerEvent->SerializedlParam != NULL)
    {
        ulSize += ((ULONG)wcslen((WCHAR*)(pSerEvent + 1)) + 1) * sizeof( WCHAR );
    }
    // Round up to nearest DWORD
    ulSize += 3;
    ulSize -= ulSize % 4;
    return ulSize;
}
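
The same size rules can be applied without the Windows SDK to split an EVNT payload into records. The sketch below is an illustration under several assumptions that should be checked against sapi.h: the header is 24 bytes with eEventId at offset 0 and elParamType at offset 2 (16 bits each, little-endian), SerializedwParam at offset 16 and SerializedlParam at offset 20, and the SPEVENTLPARAMTYPE values are TOKEN = 1, POINTER = 3, STRING = 4. `EventRecord` and `walk_events` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed SPEVENTLPARAMTYPE values; verify against sapi.h in your SDK.
constexpr std::uint16_t kParamIsToken = 1;    // SPET_LPARAM_IS_TOKEN
constexpr std::uint16_t kParamIsPointer = 3;  // SPET_LPARAM_IS_POINTER
constexpr std::uint16_t kParamIsString = 4;   // SPET_LPARAM_IS_STRING
constexpr std::size_t kHeaderSize = 24;       // assumed sizeof(SPSERIALIZEDEVENT)

struct EventRecord {
    std::uint16_t event_id;    // eEventId
    std::uint16_t param_type;  // elParamType
    std::size_t offset;        // start of the record within the chunk
    std::size_t size;          // header plus trailing data, DWORD-aligned
};

// Read a little-endian integer of 1 to 4 bytes.
static std::uint32_t read_le(const std::uint8_t* p, int bytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < bytes; ++i) v |= std::uint32_t(p[i]) << (8 * i);
    return v;
}

// Hypothetical walker: applies the rules of SpSerializedEventSize to each
// record in turn and stops at the first record that does not fit.
std::vector<EventRecord> walk_events(const std::vector<std::uint8_t>& chunk) {
    std::vector<EventRecord> records;
    std::size_t pos = 0;
    while (chunk.size() - pos >= kHeaderSize) {
        const std::uint8_t* h = chunk.data() + pos;
        EventRecord rec = {static_cast<std::uint16_t>(read_le(h, 2)),
                           static_cast<std::uint16_t>(read_le(h + 2, 2)),
                           pos, kHeaderSize};
        const std::uint32_t wparam = read_le(h + 16, 4);
        const std::uint32_t lparam = read_le(h + 20, 4);
        const std::size_t avail = chunk.size() - pos - kHeaderSize;

        if (rec.param_type == kParamIsPointer && lparam != 0) {
            if (wparam > avail) break;  // blob runs past the end of the chunk
            rec.size += wparam;         // wParam holds the blob length
        } else if ((rec.param_type == kParamIsString ||
                    rec.param_type == kParamIsToken) && lparam != 0) {
            // Scan 2-byte units up to and including the terminator.
            std::size_t n = 0;
            bool terminated = false;
            while (n + 2 <= avail && !terminated) {
                terminated = h[kHeaderSize + n] == 0 && h[kHeaderSize + n + 1] == 0;
                n += 2;
            }
            if (!terminated) break;     // no terminator before the chunk ends
            rec.size += n;
        }
        rec.size = (rec.size + 3) / 4 * 4;  // round up to nearest DWORD
        if (rec.size > chunk.size() - pos) break;
        records.push_back(rec);
        pos += rec.size;
    }
    return records;
}
```

Reading the fields by byte offset rather than casting to a struct keeps the sketch independent of compiler bitfield layout and host byte order.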