chore: update CI ffmpeg to 7.1.1

Merge pull request #8 from spicyjpeg/bs-v3-update
BS v3 support, SPU-ADPCM loop flag fixes and general refactoring
2025-03-08 07:38:56 +01:00 · 2025-03-08 07:21:49 +01:00 · 2025-03-08 01:10:42 +01:00 · 2025-03-05 01:32:35 +01:00 · 2025-03-02 20:15:06 +01:00 · 2025-03-02 12:12:51 +01:00
20 changed files with 2965 additions and 1608 deletions
--- a/.editorconfig
+++ b/.editorconfig
@@ -0,0 +1,9 @@
+root = true
+
+[*]
+indent_style             = tab
+indent_size              = 4
+charset                  = utf-8
+end_of_line              = lf
+trim_trailing_whitespace = true
+insert_final_newline     = true
--- a/.github/scripts/build.sh
+++ b/.github/scripts/build.sh
@@ -1,7 +1,7 @@
 #!/bin/bash

 ROOT_DIR="$(pwd)"
-FFMPEG_VERSION="6.0"
+FFMPEG_VERSION="7.1.1"
 NUM_JOBS="4"

 if [ $# -eq 1 ]; then
@@ -67,6 +67,7 @@ rm -rf ffmpeg-build

 meson setup \
 	--buildtype release \
+	-Db_lto=true \
 	--strip \
 	--prefix $ROOT_DIR/psxavenc-dist \
 	--pkg-config-path $ROOT_DIR/ffmpeg-dist/lib/pkgconfig \
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -17,13 +17,14 @@ jobs:
      uses: actions/checkout@v3
      with:
        path: psxavenc
+        fetch-depth: 0

    - name: Build psxavenc for Windows
      run: |
        psxavenc/.github/scripts/build.sh psxavenc-windows x86_64-w64-mingw32 psxavenc/.github/scripts/mingw-cross.txt

    - name: Upload Windows build artifacts
-      uses: actions/upload-artifact@v3
+      uses: actions/upload-artifact@v4
      with:
        name: psxavenc-windows
        path: psxavenc-windows.zip
@@ -33,7 +34,7 @@ jobs:
        psxavenc/.github/scripts/build.sh psxavenc-linux

    - name: Upload Linux build artifacts
-      uses: actions/upload-artifact@v3
+      uses: actions/upload-artifact@v4
      with:
        name: psxavenc-linux
        path: psxavenc-linux.zip
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,6 @@
+desktop.ini
+.DS_Store
+.vscode/
+build/
+.cache/
+*.code-workspace
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 # psxavenc

 psxavenc is an open-source command-line tool for encoding audio and video data
-into formats commonly used on the original PlayStation.
+into formats commonly used on the original PlayStation and PlayStation 2.

 ## Installation

@@ -14,22 +14,22 @@ Requirements:

 ```shell
 $ meson setup build
-$ cd build
-$ ninja install
+$ meson compile -C build
+$ meson install -C build
 ```

 ## Usage

-Run `psxavenc`.
+Run `psxavenc -h`.

 ### Examples

 Rescale a video file to ≤320x240 pixels (preserving aspect ratio) and encode it
-into a 15fps .STR file with 37800 Hz 4-bit stereo audio and 2352-byte sectors,
-meant to be played at 2x CD-ROM speed:
+into a 15 fps version 2 .str file with 37800 Hz 4-bit stereo audio and 2352-byte
+sectors, meant to be played at 2x CD-ROM speed:

 ```shell
-$ psxavenc -t str2cd -f 37800 -b 4 -c 2 -s 320x240 -r 15 -x 2 in.mp4 out.str
+$ psxavenc -t strcd -v v2 -f 37800 -b 4 -c 2 -s 320x240 -r 15 -x 2 in.mp4 out.str
 ```

 Convert a mono audio sample to 22050 Hz raw SPU-ADPCM data:
@@ -38,35 +38,77 @@ Convert a mono audio sample to 22050 Hz raw SPU-ADPCM data:
 $ psxavenc -t spu -f 22050 in.ogg out.snd
 ```

-Convert a stereo audio file to a 44100 Hz interleaved .VAG file with 8192-byte
+Convert a stereo audio file to a 44100 Hz interleaved .vag file with 2048-byte
 interleave and loop flags set at the end of each interleaved chunk:

 ```shell
-$ psxavenc -t vagi -f 44100 -c 2 -L -i 8192 in.wav out.vag
+$ psxavenc -t vagi -f 44100 -c 2 -L -i 2048 in.wav out.vag
 ```

-## Supported formats
+## Supported output formats

-| Format   | Audio            | Channels | Video | Sector size |
-| :------- | :--------------- | :------- | :---- | :---------- |
-| `xa`     | XA-ADPCM         | 1 or 2   | None  | 2336 bytes  |
-| `xacd`   | XA-ADPCM         | 1 or 2   | None  | 2352 bytes  |
-| `spu`    | SPU-ADPCM        | 1        | None  |             |
-| `spui`   | SPU-ADPCM        | Any      | None  | Any         |
-| `vag`    | SPU-ADPCM        | 1        | None  |             |
-| `vagi`   | SPU-ADPCM        | Any      | None  | Any         |
-| `str2`   | None or XA-ADPCM | 1 or 2   | BS v2 | 2336 bytes  |
-| `str2cd` | None or XA-ADPCM | 1 or 2   | BS v2 | 2352 bytes  |
-| `sbs2`   | None             |          | BS v2 | Any         |
+The output format must be set using the `-t` option.
+
+| Format  | Audio codec          | Audio channels | Video codec   | Sector size |
+| :------ | :------------------- | :------------- | :------------ | :---------- |
+| `xa`    | XA-ADPCM             | 1 or 2         |               | 2336 bytes  |
+| `xacd`  | XA-ADPCM             | 1 or 2         |               | 2352 bytes  |
+| `spu`   | SPU-ADPCM            | 1              |               |             |
+| `vag`   | SPU-ADPCM            | 1              |               |             |
+| `spui`  | SPU-ADPCM            | Any            |               |             |
+| `vagi`  | SPU-ADPCM            | Any            |               |             |
+| `str`   | XA-ADPCM (optional)  | 1 or 2         | BS v2/v3/v3dc | 2336 bytes  |
+| `strcd` | XA-ADPCM (optional)  | 1 or 2         | BS v2/v3/v3dc | 2352 bytes  |
+| `strv`  |                      |                | BS v2/v3/v3dc | 2048 bytes  |
+| `sbs`   |                      |                | BS v2/v3/v3dc |             |

 Notes:

-- `vag` and `vagi` are similar to `spu` and `spui` respectively, but add a .VAG
+- The `xa`, `xacd`, `str` and `strcd` formats will output files with 2336- or
+  2352-byte CD-ROM sectors, containing the appropriate CD-XA subheaders and
+  dummy EDC/ECC placeholders in addition to the actual sector data. Such files
+  **cannot be added to a disc image as-is** and must instead be parsed by an
+  authoring tool capable of rebuilding the EDC/ECC data (as it is dependent on
+  the file's absolute location on the disc) and generating a Mode 2 CD-ROM image
+  with "native" 2352-byte sectors.
+- Similarly, files generated with `-t xa` or `-t xacd` **must be interleaved**
+  **with other XA-ADPCM tracks or empty padding using an external tool** before
+  they can be played.
+- `vag` and `vagi` are similar to `spu` and `spui` respectively, but add a .vag
  header at the beginning of the file. The header is always 48 bytes long for
  `vag` files, while in the case of `vagi` files it is padded to the size
  specified using the `-a` option (2048 bytes by default). Note that `vagi`
  files with more than 2 channels and/or alignment other than 2048 bytes are not
  standardized.
-- The `sbs2` format (used in some System 573 games) is simply a series of
-  concatenated BS v2 frames, each padded to the size specified by the `-a`
-  option, with no additional headers besides the BS frame headers.
+- ~~The `strspu` format encodes the input file's audio track as a series of~~
+  ~~custom .str chunks (type ID `0x0001` by default) holding interleaved~~
+  ~~SPU-ADPCM data in the same format as `spui`, rather than XA-ADPCM. As .str~~
+  ~~chunks do not require custom XA subheaders, a file with standard 2048-byte~~
+  ~~sectors that does not need any special handling will be generated.~~ *This*
+  *format has not yet been implemented.*
+- The `strv` format disables audio altogether and is equivalent to `strspu` on
+  an input file with no audio track.
+- The `sbs` format (used in some System 573 games) consists of a series of
+  concatenated BS frames, each padded to the size specified by the `-a` option
+  (the default setting is 8192 bytes), with no additional headers besides the BS
+  frame headers.
+
+## Supported video codecs
+
+All formats with a video track (`str`, `strcd`, `strv` and `sbs`) can use any of
+the codecs listed below. The codec can be set using the `-v` option.
+
+| Codec          | Supported by          | Typ. decoder CPU usage |
+| :------------- | :-------------------- | :--------------------- |
+| `v2` (default) | All players/decoders  | Medium                 |
+| `v3`           | Most players/decoders | High                   |
+| `v3dc`         | Few players/decoders  | High                   |
+
+Notes:
+
+- The `v3dc` format is a variant of `v3` with a slightly better compression
+  ratio, however most tools and playback libraries (including FFmpeg, jPSXdec
+  and earlier versions of Sony's own BS decoder) are unable to decode it
+  correctly; its use is thus highly discouraged. Refer to
+  [the psx-spx section on DC coefficient encoding](https://psx-spx.consoledev.net/cdromfileformats/#dc-v3)
+  for more details.
--- a/libpsxav/adpcm.c
+++ b/libpsxav/adpcm.c
@@ -29,14 +29,21 @@ freely, subject to the following restrictions:
 #define SHIFT_RANGE_4BPS 12
 #define SHIFT_RANGE_8BPS 8

-#define ADPCM_FILTER_COUNT 5
-#define XA_ADPCM_FILTER_COUNT 4
+#define ADPCM_FILTER_COUNT     5
+#define XA_ADPCM_FILTER_COUNT  4
 #define SPU_ADPCM_FILTER_COUNT 5

 static const int16_t filter_k1[ADPCM_FILTER_COUNT] = {0, 60, 115, 98, 122};
 static const int16_t filter_k2[ADPCM_FILTER_COUNT] = {0, 0, -52, -55, -60};

-static int find_min_shift(const psx_audio_encoder_channel_state_t *state, int16_t *samples, int sample_limit, int pitch, int filter, int shift_range) {
+static int find_min_shift(
+	const psx_audio_encoder_channel_state_t *state,
+	const int16_t *samples,
+	int sample_limit,
+	int pitch,
+	int filter,
+	int shift_range
+) {
 	// Assumption made:
 	//
 	// There is value in shifting right one step further to allow the nibbles to clip.
@@ -54,7 +61,7 @@ static int find_min_shift(const psx_audio_encoder_channel_state_t *state, int16_

 	int32_t s_min = 0;
 	int32_t s_max = 0;
-	for (int i = 0; i < 28; i++) {
+	for (int i = 0; i < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; i++) {
 		int32_t raw_sample = (i >= sample_limit) ? 0 : samples[i * pitch];
 		int32_t previous_values = (k1*prev1 + k2*prev2 + (1<<5))>>6;
 		int32_t sample = raw_sample - previous_values;
@@ -71,7 +78,19 @@ static int find_min_shift(const psx_audio_encoder_channel_state_t *state, int16_
 	return min_shift;
 }

-static uint8_t attempt_to_encode(psx_audio_encoder_channel_state_t *outstate, const psx_audio_encoder_channel_state_t *instate, int16_t *samples, int sample_limit, int pitch, uint8_t *data, int data_shift, int data_pitch, int filter, int sample_shift, int shift_range) {
+static uint8_t attempt_to_encode(
+	psx_audio_encoder_channel_state_t *outstate,
+	const psx_audio_encoder_channel_state_t *instate,
+	const int16_t *samples,
+	int sample_limit,
+	int pitch,
+	uint8_t *data,
+	int data_shift,
+	int data_pitch,
+	int filter,
+	int sample_shift,
+	int shift_range
+) {
 	uint8_t sample_mask = 0xFFFF >> shift_range;
 	uint8_t nondata_mask = ~(sample_mask << data_shift);

@@ -87,7 +106,7 @@ static uint8_t attempt_to_encode(psx_audio_encoder_channel_state_t *outstate, co

 	outstate->mse = 0;

-	for (int i = 0; i < 28; i++) {
+	for (int i = 0; i < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; i++) {
 		int32_t sample = ((i >= sample_limit) ? 0 : samples[i * pitch]) + outstate->qerr;
 		int32_t previous_values = (k1*outstate->prev1 + k2*outstate->prev2 + (1<<5))>>6;
 		int32_t sample_enc = sample - previous_values;
@@ -120,8 +139,18 @@ static uint8_t attempt_to_encode(psx_audio_encoder_channel_state_t *outstate, co
 	return hdr;
 }

-static uint8_t encode(psx_audio_encoder_channel_state_t *state, int16_t *samples, int sample_limit, int pitch, uint8_t *data, int data_shift, int data_pitch, int filter_count, int shift_range) {
-    psx_audio_encoder_channel_state_t proposed;
+static uint8_t encode(
+	psx_audio_encoder_channel_state_t *state,
+	const int16_t *samples,
+	int sample_limit,
+	int pitch,
+	uint8_t *data,
+	int data_shift,
+	int data_pitch,
+	int filter_count,
+	int shift_range
+) {
+	psx_audio_encoder_channel_state_t proposed;
 	int64_t best_mse = ((int64_t)1<<(int64_t)50);
 	int best_filter = 0;
 	int best_sample_shift = 0;
@@ -161,7 +190,13 @@ static uint8_t encode(psx_audio_encoder_channel_state_t *state, int16_t *samples
 		best_filter, best_sample_shift, shift_range);
 }

-static void encode_block_xa(int16_t *audio_samples, int audio_samples_limit, uint8_t *data, psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state) {
+static void encode_block_xa(
+	const int16_t *audio_samples,
+	int audio_samples_limit,
+	uint8_t *data,
+	psx_audio_xa_settings_t settings,
+	psx_audio_encoder_state_t *state
+) {
 	if (settings.bits_per_sample == 4) {
 		if (settings.stereo) {
 			data[0]  = encode(&(state->left),  audio_samples,            audio_samples_limit,        2, data + 0x10, 0, 4, XA_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS);
@@ -205,25 +240,17 @@ uint32_t psx_audio_xa_get_buffer_size(psx_audio_xa_settings_t settings, int samp
 }

 uint32_t psx_audio_spu_get_buffer_size(int sample_count) {
-	return ((sample_count + 27) / 28) << 4;
+	return ((sample_count + PSX_AUDIO_SPU_SAMPLES_PER_BLOCK - 1) / PSX_AUDIO_SPU_SAMPLES_PER_BLOCK) << 4;
 }

 uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings) {
 	return settings.format == PSX_AUDIO_XA_FORMAT_XA ? 2336 : 2352;
 }

-uint32_t psx_audio_spu_get_buffer_size_per_block(void) {
-	return 16;
-}
-
 uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings) {
 	return (((settings.bits_per_sample == 8) ? 112 : 224) >> (settings.stereo ? 1 : 0)) * 18;
 }

-uint32_t psx_audio_spu_get_samples_per_block(void) {
-	return 28;
-}
-
 uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings) {
 	// 1/2 interleave for 37800 Hz 8-bit stereo at 1x speed
 	int interleave = settings.stereo ? 2 : 4;
@@ -232,40 +259,60 @@ uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings) {
 	return interleave;
 }

-static void psx_audio_xa_encode_init_sector(uint8_t *buffer, psx_audio_xa_settings_t settings) {
-	if (settings.format == PSX_AUDIO_XA_FORMAT_XACD) {
-		memset(buffer, 0, 2352);
-		memset(buffer+0x001, 0xFF, 10);
-		buffer[0x00F] = 0x02;
-	} else {
-		memset(buffer + 0x10, 0, 2336);
-	}
-
-	buffer[0x010] = settings.file_number;
-	buffer[0x011] = settings.channel_number & 0x1F;
-	buffer[0x012] = 0x24 | 0x40;
-	buffer[0x013] =
-		(settings.stereo ? 1 : 0)
-		| (settings.frequency >= PSX_AUDIO_XA_FREQ_DOUBLE ? 0 : 4)
-		| (settings.bits_per_sample >= 8 ? 16 : 0);
-	memcpy(buffer + 0x014, buffer + 0x010, 4);
+static inline void psx_audio_xa_sync_subheader_copy(psx_cdrom_sector_mode2_t *buffer) {
+	memcpy(buffer->subheader + 1, buffer->subheader, sizeof(psx_cdrom_sector_xa_subheader_t));
 }

-int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state, int16_t* samples, int sample_count, uint8_t *output) {
+static void psx_audio_xa_encode_init_sector(psx_cdrom_sector_mode2_t *buffer, int lba, psx_audio_xa_settings_t settings) {
+	if (settings.format == PSX_AUDIO_XA_FORMAT_XACD)
+		psx_cdrom_init_sector((psx_cdrom_sector_t *)buffer, lba, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2);
+
+	buffer->subheader[0].file = settings.file_number;
+	buffer->subheader[0].channel = settings.channel_number & PSX_CDROM_SECTOR_XA_CHANNEL_MASK;
+	buffer->subheader[0].submode =
+		PSX_CDROM_SECTOR_XA_SUBMODE_AUDIO
+		| PSX_CDROM_SECTOR_XA_SUBMODE_FORM2
+		| PSX_CDROM_SECTOR_XA_SUBMODE_RT;
+
+	if (settings.stereo)
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_STEREO;
+	else
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_MONO;
+	if (settings.frequency == PSX_AUDIO_XA_FREQ_DOUBLE)
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_FREQ_DOUBLE;
+	else
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_FREQ_SINGLE;
+	if (settings.bits_per_sample == 8)
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_BITS_8;
+	else
+		buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_BITS_4;
+
+	psx_audio_xa_sync_subheader_copy(buffer);
+}
+
+int psx_audio_xa_encode(
+	psx_audio_xa_settings_t settings,
+	psx_audio_encoder_state_t *state,
+	const int16_t *samples,
+	int sample_count,
+	int lba,
+	uint8_t *output
+) {
 	int sample_jump = (settings.bits_per_sample == 8) ? 112 : 224;
 	int i, j;
-	int xa_sector_size = settings.format == PSX_AUDIO_XA_FORMAT_XA ? 2336 : 2352;
-	int xa_offset = 2352 - xa_sector_size;
+	int xa_sector_size = psx_audio_xa_get_buffer_size_per_sector(settings);
+	int xa_offset = PSX_CDROM_SECTOR_SIZE - xa_sector_size;
 	uint8_t init_sector = 1;

-	if (settings.stereo) { sample_count <<= 1; }
-	
+	if (settings.stereo)
+		sample_count *= 2;
+
 	for (i = 0, j = 0; i < sample_count || ((j % 18) != 0); i += sample_jump, j++) {
-		uint8_t *sector_data = output + ((j/18) * xa_sector_size) - xa_offset;
-		uint8_t *block_data = sector_data + 0x18 + ((j%18) * 0x80);
+		psx_cdrom_sector_mode2_t *sector_data = (psx_cdrom_sector_mode2_t*) (output + ((j/18) * xa_sector_size) - xa_offset);
+		uint8_t *block_data = sector_data->data + ((j%18) * 0x80);

 		if (init_sector) {
-			psx_audio_xa_encode_init_sector(sector_data, settings);
+			psx_audio_xa_encode_init_sector(sector_data, lba, settings);
 			init_sector = 0;
 		}

@@ -275,8 +322,9 @@ int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_stat
 		memcpy(block_data + 12, block_data + 8, 4);

 		if ((j+1)%18 == 0) {
-			psx_cdrom_calculate_checksums(sector_data, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2);
+			psx_cdrom_calculate_checksums((psx_cdrom_sector_t *)sector_data, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2);
 			init_sector = 1;
+			lba++;
 		}
 	}

@@ -285,28 +333,41 @@ int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_stat

 void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length) {
 	if (output_length >= 2336) {
-		output[output_length - 2352 + 0x12] |= 0x80;
-		output[output_length - 2352 + 0x18] |= 0x80;
+		psx_cdrom_sector_mode2_t *sector = (psx_cdrom_sector_mode2_t*) &output[output_length - PSX_CDROM_SECTOR_SIZE];
+		sector->subheader[0].submode |= PSX_CDROM_SECTOR_XA_SUBMODE_EOF;
+		psx_audio_xa_sync_subheader_copy(sector);
 	}
 }

-int psx_audio_xa_encode_simple(psx_audio_xa_settings_t settings, int16_t* samples, int sample_count, uint8_t *output) {
+int psx_audio_xa_encode_simple(
+	psx_audio_xa_settings_t settings,
+	const int16_t *samples,
+	int sample_count,
+	int lba,
+	uint8_t *output
+) {
 	psx_audio_encoder_state_t state;
 	memset(&state, 0, sizeof(psx_audio_encoder_state_t));
-	int length = psx_audio_xa_encode(settings, &state, samples, sample_count, output);
+	int length = psx_audio_xa_encode(settings, &state, samples, sample_count, lba, output);
 	psx_audio_xa_encode_finalize(settings, output, length);
 	return length;
 }

-int psx_audio_spu_encode(psx_audio_encoder_channel_state_t *state, int16_t* samples, int sample_count, int pitch, uint8_t *output) {
-	uint8_t prebuf[28];
+int psx_audio_spu_encode(
+	psx_audio_encoder_channel_state_t *state,
+	const int16_t *samples,
+	int sample_count,
+	int pitch,
+	uint8_t *output
+) {
+	uint8_t prebuf[PSX_AUDIO_SPU_SAMPLES_PER_BLOCK];
 	uint8_t *buffer = output;

-	for (int i = 0; i < sample_count; i += 28, buffer += 16) {
+	for (int i = 0; i < sample_count; i += PSX_AUDIO_SPU_SAMPLES_PER_BLOCK, buffer += PSX_AUDIO_SPU_BLOCK_SIZE) {
 		buffer[0] = encode(state, samples + i * pitch, sample_count - i, pitch, prebuf, 0, 1, SPU_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS);
 		buffer[1] = 0;

-		for (int j = 0; j < 28; j+=2) {
+		for (int j = 0; j < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; j+=2) {
 			buffer[2 + (j>>1)] = (prebuf[j] & 0x0F) | (prebuf[j+1] << 4);
 		}
 	}
@@ -314,29 +375,29 @@ int psx_audio_spu_encode(psx_audio_encoder_channel_state_t *state, int16_t* samp
 	return buffer - output;
 }

-int psx_audio_spu_encode_simple(int16_t* samples, int sample_count, uint8_t *output, int loop_start) {
+int psx_audio_spu_encode_simple(const int16_t *samples, int sample_count, uint8_t *output, int loop_start) {
 	psx_audio_encoder_channel_state_t state;
 	memset(&state, 0, sizeof(psx_audio_encoder_channel_state_t));
 	int length = psx_audio_spu_encode(&state, samples, sample_count, 1, output);

-	if (length >= 32) {
+	if (length >= PSX_AUDIO_SPU_BLOCK_SIZE) {
+		uint8_t *last_block = output + length - PSX_AUDIO_SPU_BLOCK_SIZE;
+
 		if (loop_start < 0) {
-			//output[1] = PSX_AUDIO_SPU_LOOP_START;
-			output[length - 16 + 1] = PSX_AUDIO_SPU_LOOP_END;
+			last_block[1] |= PSX_AUDIO_SPU_LOOP_END;
+
+			// Insert trailing looping block
+			memset(output + length, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
+			output[length + 1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
+
+			length += PSX_AUDIO_SPU_BLOCK_SIZE;
 		} else {
-			psx_audio_spu_set_flag_at_sample(output, loop_start, PSX_AUDIO_SPU_LOOP_START);
-			output[length - 16 + 1] = PSX_AUDIO_SPU_LOOP_REPEAT;
+			int loop_start_offset = loop_start / PSX_AUDIO_SPU_SAMPLES_PER_BLOCK * PSX_AUDIO_SPU_BLOCK_SIZE;
+
+			last_block[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
+			output[loop_start_offset + 1] |= PSX_AUDIO_SPU_LOOP_START;
 		}
-	} else if (length >= 16) {
-		output[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
-		if (loop_start >= 0)
-			output[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
 	}

 	return length;
 }
-
-void psx_audio_spu_set_flag_at_sample(uint8_t* spu_data, int sample_pos, int flag) {
-	int buffer_pos = (sample_pos / 28) << 4;
-	spu_data[buffer_pos + 1] = flag;
-}
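
Review note on the SPU-ADPCM block arithmetic above: the hunks replace the literals 28 and 16 with `PSX_AUDIO_SPU_SAMPLES_PER_BLOCK` and `PSX_AUDIO_SPU_BLOCK_SIZE`. The following is a minimal standalone sketch of the same rounding and offset calculations; the macro and function names (`SPU_BLOCK_SIZE`, `spu_buffer_size`, `spu_block_offset`) are hypothetical and are not part of libpsxav.

```c
#include <stdint.h>

/* Values mirror PSX_AUDIO_SPU_BLOCK_SIZE and PSX_AUDIO_SPU_SAMPLES_PER_BLOCK
 * from the diff; the names below are illustrative only. */
#define SPU_BLOCK_SIZE        16
#define SPU_SAMPLES_PER_BLOCK 28

/* Number of bytes needed to hold sample_count samples, rounding up to a
 * whole number of 28-sample blocks (16 bytes each). */
static uint32_t spu_buffer_size(int sample_count) {
	int blocks = (sample_count + SPU_SAMPLES_PER_BLOCK - 1) / SPU_SAMPLES_PER_BLOCK;
	return (uint32_t)blocks * SPU_BLOCK_SIZE;
}

/* Byte offset of the block containing the sample at sample_pos. The loop
 * flags live in the second byte of that block (offset + 1). */
static int spu_block_offset(int sample_pos) {
	return sample_pos / SPU_SAMPLES_PER_BLOCK * SPU_BLOCK_SIZE;
}
```

Going by the new `psx_audio_spu_encode_simple()` body, a negative `loop_start` appends one extra looping block after the encoded data, so callers would presumably need `PSX_AUDIO_SPU_BLOCK_SIZE` bytes of headroom beyond what `psx_audio_spu_get_buffer_size()` reports.
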
--- a/libpsxav/cdrom.c
+++ b/libpsxav/cdrom.c
@@ -21,49 +21,91 @@ freely, subject to the following restrictions:
 3. This notice may not be removed or altered from any source distribution.
 */

+#include <stdint.h>
 #include <string.h>
 #include "libpsxav.h"

-static uint32_t psx_cdrom_calculate_edc(uint8_t *sector, uint32_t offset, uint32_t size)
-{
+#define EDC_CRC32_POLYNOMIAL 0xD8018001
+
+static uint32_t edc_crc32(uint8_t *data, int length) {
 	uint32_t edc = 0;
-	for (int i = offset; i < offset+size; i++) {
-		edc ^= 0xFF&(uint32_t)sector[i];
-		for (int ibit = 0; ibit < 8; ibit++) {
-			edc = (edc>>1)^(0xD8018001*(edc&0x1));
-		}
+
+	for (int i = 0; i < length; i++) {
+		edc ^= 0xFF & (uint32_t)data[i];
+
+		for (int j = 0; j < 8; j++)
+			edc = (edc >> 1) ^ (EDC_CRC32_POLYNOMIAL * (edc & 0x1));
 	}
+
 	return edc;
 }

-void psx_cdrom_calculate_checksums(uint8_t *sector, psx_cdrom_sector_type_t type)
-{
-	switch (type) {
-		case PSX_CDROM_SECTOR_TYPE_MODE1: {
-			uint32_t edc = psx_cdrom_calculate_edc(sector, 0x0, 0x810);
-			sector[0x810] = (uint8_t)(edc);
-			sector[0x811] = (uint8_t)(edc >> 8);
-			sector[0x812] = (uint8_t)(edc >> 16);
-			sector[0x813] = (uint8_t)(edc >> 24);
+#define TO_BCD(x) ((x) + ((x) / 10) * 6)

+void psx_cdrom_init_xa_subheader(psx_cdrom_sector_xa_subheader_t *subheader, psx_cdrom_sector_type_t type) {
+	memset(subheader, 0, sizeof(psx_cdrom_sector_xa_subheader_t) * 2);
+	subheader->submode = PSX_CDROM_SECTOR_XA_SUBMODE_DATA;
+
+	if (type == PSX_CDROM_SECTOR_TYPE_MODE2_FORM2)
+		subheader->submode |= PSX_CDROM_SECTOR_XA_SUBMODE_FORM2;
+
+	memcpy(subheader + 1, subheader, sizeof(psx_cdrom_sector_xa_subheader_t));
+}
+
+void psx_cdrom_init_sector(psx_cdrom_sector_t *sector, int lba, psx_cdrom_sector_type_t type) {
+	// Sync sequence
+	memset(sector->mode1.sync + 1, 0xff, 10);
+	sector->mode1.sync[0x0] = 0x00;
+	sector->mode1.sync[0xB] = 0x00;
+
+	// Timecode
+	lba += 150;
+	sector->mode1.header.minute = TO_BCD(lba / 4500);
+	sector->mode1.header.second = TO_BCD((lba / 75) % 60);
+	sector->mode1.header.sector = TO_BCD(lba % 75);
+
+	// Mode
+	if (type == PSX_CDROM_SECTOR_TYPE_MODE1) {
+		sector->mode1.header.mode = 0x01;
+	} else {
+		sector->mode2.header.mode = 0x02;
+		psx_cdrom_init_xa_subheader(sector->mode2.subheader, type);
+	}
+}
+
+void psx_cdrom_calculate_checksums(psx_cdrom_sector_t *sector, psx_cdrom_sector_type_t type) {
+	uint8_t *data = (uint8_t *)sector;
+	uint32_t edc;
+
+	switch (type) {
+		case PSX_CDROM_SECTOR_TYPE_MODE1:
+			edc = edc_crc32(data, 0x810);
+
+			data[0x810] = (uint8_t)(edc);
+			data[0x811] = (uint8_t)(edc >> 8);
+			data[0x812] = (uint8_t)(edc >> 16);
+			data[0x813] = (uint8_t)(edc >> 24);
 			memset(sector + 0x814, 0, 8);
 			// TODO: ECC
-		} break;
-		case PSX_CDROM_SECTOR_TYPE_MODE2_FORM1: {
-			uint32_t edc = psx_cdrom_calculate_edc(sector, 0x10, 0x808);
-			sector[0x818] = (uint8_t)(edc);
-			sector[0x819] = (uint8_t)(edc >> 8);
-			sector[0x81A] = (uint8_t)(edc >> 16);
-			sector[0x81B] = (uint8_t)(edc >> 24);
+			break;

+		case PSX_CDROM_SECTOR_TYPE_MODE2_FORM1:
+			edc = edc_crc32(data + 0x10, 0x808);
+
+			data[0x818] = (uint8_t)(edc);
+			data[0x819] = (uint8_t)(edc >> 8);
+			data[0x81A] = (uint8_t)(edc >> 16);
+			data[0x81B] = (uint8_t)(edc >> 24);
 			// TODO: ECC
-		} break;
-		case PSX_CDROM_SECTOR_TYPE_MODE2_FORM2: {
-			uint32_t edc = psx_cdrom_calculate_edc(sector, 0x10, 0x91C);
-			sector[0x92C] = (uint8_t)(edc);
-			sector[0x92D] = (uint8_t)(edc >> 8);
-			sector[0x92E] = (uint8_t)(edc >> 16);
-			sector[0x92F] = (uint8_t)(edc >> 24);
-		} break;
+			break;
+
+		case PSX_CDROM_SECTOR_TYPE_MODE2_FORM2:
+			edc = edc_crc32(data + 0x10, 0x91C);
+
+			data[0x92C] = (uint8_t)(edc);
+			data[0x92D] = (uint8_t)(edc >> 8);
+			data[0x92E] = (uint8_t)(edc >> 16);
+			data[0x92F] = (uint8_t)(edc >> 24);
+			break;
 	}
-}
+}
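
Review note on the timecode logic in `psx_cdrom_init_sector()`: the sector header stores the absolute position as BCD minute/second/sector fields, offset by the 150-sector (2-second) pregap. The sketch below reproduces that conversion in isolation so it can be checked by hand; `msf_t`, `to_bcd` and `lba_to_msf` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Illustrative container for the three BCD timecode fields. */
typedef struct {
	uint8_t minute;
	uint8_t second;
	uint8_t sector;
} msf_t;

/* Pack a value in the 0-99 range into binary-coded decimal, equivalent to
 * the TO_BCD() macro in the diff. */
static uint8_t to_bcd(int value) {
	return (uint8_t)(value + (value / 10) * 6);
}

/* Convert a logical block address into a BCD minute/second/sector triple,
 * following the same steps as psx_cdrom_init_sector(). */
static msf_t lba_to_msf(int lba) {
	msf_t msf;

	lba += 150; /* 2-second pregap: 75 sectors per second */
	msf.minute = to_bcd(lba / 4500);
	msf.second = to_bcd((lba / 75) % 60);
	msf.sector = to_bcd(lba % 75);
	return msf;
}
```

For example, LBA 0 maps to 00:02:00, which is the conventional start of the data area on a disc.
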
--- a/libpsxav/libpsxav.h
+++ b/libpsxav/libpsxav.h
@@ -21,16 +21,20 @@ freely, subject to the following restrictions:
 3. This notice may not be removed or altered from any source distribution.
 */

-#ifndef __LIBPSXAV_H__
-#define __LIBPSXAV_H__
+#pragma once

 #include <stdbool.h>
 #include <stdint.h>

 // audio.c

-#define PSX_AUDIO_XA_FREQ_SINGLE 18900
-#define PSX_AUDIO_XA_FREQ_DOUBLE 37800
+#define PSX_AUDIO_SPU_BLOCK_SIZE        16
+#define PSX_AUDIO_SPU_SAMPLES_PER_BLOCK 28
+
+enum {
+	PSX_AUDIO_XA_FREQ_SINGLE = 18900,
+	PSX_AUDIO_XA_FREQ_DOUBLE = 37800
+};

 typedef enum {
 	PSX_AUDIO_XA_FORMAT_XA, // .xa file
@@ -57,34 +61,113 @@ typedef struct {
 	psx_audio_encoder_channel_state_t right;
 } psx_audio_encoder_state_t;

-#define PSX_AUDIO_SPU_LOOP_END 1
-#define PSX_AUDIO_SPU_LOOP_REPEAT 3
-#define PSX_AUDIO_SPU_LOOP_START 4
+enum {
+	PSX_AUDIO_SPU_LOOP_END    = 1 << 0,
+	PSX_AUDIO_SPU_LOOP_REPEAT = 3 << 0,
+	PSX_AUDIO_SPU_LOOP_START  = 1 << 2
+};

 uint32_t psx_audio_xa_get_buffer_size(psx_audio_xa_settings_t settings, int sample_count);
 uint32_t psx_audio_spu_get_buffer_size(int sample_count);
 uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings);
-uint32_t psx_audio_spu_get_buffer_size_per_block(void);
 uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings);
-uint32_t psx_audio_spu_get_samples_per_block(void);
 uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings);
-int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state, int16_t* samples, int sample_count, uint8_t *output);
-int psx_audio_xa_encode_simple(psx_audio_xa_settings_t settings, int16_t* samples, int sample_count, uint8_t *output);
-int psx_audio_spu_encode(psx_audio_encoder_channel_state_t *state, int16_t* samples, int sample_count, int pitch, uint8_t *output);
-int psx_audio_spu_encode_simple(int16_t* samples, int sample_count, uint8_t *output, int loop_start);
+int psx_audio_xa_encode(
+	psx_audio_xa_settings_t settings,
+	psx_audio_encoder_state_t *state,
+	const int16_t *samples,
+	int sample_count,
+	int lba,
+	uint8_t *output
+);
+int psx_audio_xa_encode_simple(
+	psx_audio_xa_settings_t settings,
+	const int16_t *samples,
+	int sample_count,
+	int lba,
+	uint8_t *output
+);
+int psx_audio_spu_encode(
+	psx_audio_encoder_channel_state_t *state,
+	const int16_t *samples,
+	int sample_count,
+	int pitch,
+	uint8_t *output
+);
+int psx_audio_spu_encode_simple(const int16_t *samples, int sample_count, uint8_t *output, int loop_start);
 void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length);
-void psx_audio_spu_set_flag_at_sample(uint8_t* spu_data, int sample_pos, int flag);

 // cdrom.c

 #define PSX_CDROM_SECTOR_SIZE 2352

+typedef struct {
+	uint8_t minute;
+	uint8_t second;
+	uint8_t sector;
+	uint8_t mode;
+} psx_cdrom_sector_header_t;
+
+typedef struct {
+	uint8_t file;
+	uint8_t channel;
+	uint8_t submode;
+	uint8_t coding;
+} psx_cdrom_sector_xa_subheader_t;
+
+typedef struct {
+	uint8_t sync[12];
+	psx_cdrom_sector_header_t header;
+	uint8_t data[0x920];
+} psx_cdrom_sector_mode1_t;
+
+typedef struct {
+	uint8_t sync[12];
+	psx_cdrom_sector_header_t header;
+	psx_cdrom_sector_xa_subheader_t subheader[2];
+	uint8_t data[0x918];
+} psx_cdrom_sector_mode2_t;
+
+typedef union {
+	psx_cdrom_sector_mode1_t mode1;
+	psx_cdrom_sector_mode2_t mode2;
+} psx_cdrom_sector_t;
+
+_Static_assert(sizeof(psx_cdrom_sector_mode1_t) == PSX_CDROM_SECTOR_SIZE, "Invalid Mode1 sector size");
+_Static_assert(sizeof(psx_cdrom_sector_mode2_t) == PSX_CDROM_SECTOR_SIZE, "Invalid Mode2 sector size");
+
+#define PSX_CDROM_SECTOR_XA_CHANNEL_MASK 0x1F
+
+enum {
+	PSX_CDROM_SECTOR_XA_SUBMODE_EOR     = 1 << 0,
+	PSX_CDROM_SECTOR_XA_SUBMODE_VIDEO   = 1 << 1,
+	PSX_CDROM_SECTOR_XA_SUBMODE_AUDIO   = 1 << 2,
+	PSX_CDROM_SECTOR_XA_SUBMODE_DATA    = 1 << 3,
+	PSX_CDROM_SECTOR_XA_SUBMODE_TRIGGER = 1 << 4,
+	PSX_CDROM_SECTOR_XA_SUBMODE_FORM2   = 1 << 5,
+	PSX_CDROM_SECTOR_XA_SUBMODE_RT      = 1 << 6,
+	PSX_CDROM_SECTOR_XA_SUBMODE_EOF     = 1 << 7
+};
+
+enum {
+	PSX_CDROM_SECTOR_XA_CODING_MONO         = 0 << 0,
+	PSX_CDROM_SECTOR_XA_CODING_STEREO       = 1 << 0,
+	PSX_CDROM_SECTOR_XA_CODING_CHANNEL_MASK = 3 << 0,
+	PSX_CDROM_SECTOR_XA_CODING_FREQ_DOUBLE  = 0 << 2,
+	PSX_CDROM_SECTOR_XA_CODING_FREQ_SINGLE  = 1 << 2,
+	PSX_CDROM_SECTOR_XA_CODING_FREQ_MASK    = 3 << 2,
+	PSX_CDROM_SECTOR_XA_CODING_BITS_4       = 0 << 4,
+	PSX_CDROM_SECTOR_XA_CODING_BITS_8       = 1 << 4,
+	PSX_CDROM_SECTOR_XA_CODING_BITS_MASK    = 3 << 4,
+	PSX_CDROM_SECTOR_XA_CODING_EMPHASIS     = 1 << 6
+};
+
 typedef enum {
 	PSX_CDROM_SECTOR_TYPE_MODE1,
 	PSX_CDROM_SECTOR_TYPE_MODE2_FORM1,
 	PSX_CDROM_SECTOR_TYPE_MODE2_FORM2
 } psx_cdrom_sector_type_t;

-void psx_cdrom_calculate_checksums(uint8_t *sector, psx_cdrom_sector_type_t type);
-
-#endif /* __LIBPSXAV_H__ */
+void psx_cdrom_init_xa_subheader(psx_cdrom_sector_xa_subheader_t *subheader, psx_cdrom_sector_type_t type);
+void psx_cdrom_init_sector(psx_cdrom_sector_t *sector, int lba, psx_cdrom_sector_type_t type);
+void psx_cdrom_calculate_checksums(psx_cdrom_sector_t *sector, psx_cdrom_sector_type_t type);
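
Review note on the new `PSX_CDROM_SECTOR_XA_CODING_*` flags: the old code built the coding byte from the literals 1, 4 and 16, while the new header names each bit field. The sketch below shows how the byte is composed from the three audio settings, assuming the same bit layout as the enum above; the `CODING_*` constants and `xa_coding_byte` are hypothetical stand-ins rather than libpsxav symbols.

```c
#include <stdint.h>

/* Bit values copied from the PSX_CDROM_SECTOR_XA_CODING_* enum in the diff;
 * the mono, double-frequency and 4-bit variants are all zero. */
enum {
	CODING_STEREO      = 1 << 0,
	CODING_FREQ_SINGLE = 1 << 2,
	CODING_BITS_8      = 1 << 4
};

/* Build the XA subheader coding byte. A frequency of 37800 Hz is treated as
 * "double" and anything else as "single" (18900 Hz), matching the equality
 * check in the new psx_audio_xa_encode_init_sector(). */
static uint8_t xa_coding_byte(int stereo, int frequency, int bits_per_sample) {
	uint8_t coding = 0;

	if (stereo)
		coding |= CODING_STEREO;
	if (frequency != 37800)
		coding |= CODING_FREQ_SINGLE;
	if (bits_per_sample == 8)
		coding |= CODING_BITS_8;
	return coding;
}
```

With these values, 37800 Hz 4-bit stereo yields `0x01` and 18900 Hz 8-bit mono yields `0x14`.
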
--- a/meson.build
+++ b/meson.build
@@ -1,28 +1,32 @@
 project('psxavenc', 'c', default_options: ['c_std=c11'])

-add_project_arguments('-D_POSIX_C_SOURCE=201112L', language : 'c')
+add_project_arguments('-D_POSIX_C_SOURCE=201112L', '-ffast-math', language : 'c')
+
+conf_data = configuration_data()
+conf_data.set('VERSION', '"' + run_command('git', '-C', meson.project_source_root(), 'describe', '--tags', '--always', '--dirty', '--match=v*', check: true).stdout().strip() + '"')
+configure_file(output: 'config.h', configuration: conf_data)

 libm_dep = meson.get_compiler('c').find_library('m')

 ffmpeg = [
-  dependency('libavformat'),
-  dependency('libavcodec'),
-  dependency('libavutil'),
-  dependency('libswresample'),
-  dependency('libswscale')
+	dependency('libavformat'),
+	dependency('libavcodec'),
+	dependency('libavutil'),
+	dependency('libswresample'),
+	dependency('libswscale')
 ]

 libpsxav = static_library('psxav', [
-  'libpsxav/adpcm.c',
-  'libpsxav/cdrom.c',
-  'libpsxav/libpsxav.h'
+	'libpsxav/adpcm.c',
+	'libpsxav/cdrom.c',
+	'libpsxav/libpsxav.h'
 ])
 libpsxav_dep = declare_dependency(include_directories: include_directories('libpsxav'), link_with: libpsxav)

 executable('psxavenc', [
-  'psxavenc/cdrom.c',
-  'psxavenc/decoding.c',
-  'psxavenc/filefmt.c',
-  'psxavenc/mdec.c',
-  'psxavenc/psxavenc.c'
+	'psxavenc/args.c',
+	'psxavenc/decoding.c',
+	'psxavenc/filefmt.c',
+	'psxavenc/main.c',
+	'psxavenc/mdec.c'
 ], dependencies: [libm_dep, ffmpeg, libpsxav_dep], install: true)
--- a/psxavenc/args.c
+++ b/psxavenc/args.c
@@ -0,0 +1,722 @@
+/*
+psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
+
+Copyright (c) 2019, 2020 Adrian "asie" Siekierka
+Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023, 2025 spicyjpeg
+
+This software is provided 'as-is', without any express or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
+*/
+
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include "args.h"
+#include "config.h"
+
+#define INVALID_PARAM -1
+
+static int parse_int(
+	int *output,
+	const char *name,
+	const char *value,
+	int min_value,
+	int max_value
+) {
+	if (value == NULL) {
+		fprintf(stderr, "Missing %s value after option\n", name);
+		return INVALID_PARAM;
+	}
+
+	*output = strtol(value, NULL, 0);
+
+	if (
+		(*output < min_value) ||
+		(max_value >= 0 && *output > max_value)
+	) {
+		if (max_value >= 0)
+			fprintf(stderr, "Invalid %s: %d (must be in %d-%d range)\n", name, *output, min_value, max_value);
+		else
+			fprintf(stderr, "Invalid %s: %d (must be %d or greater)\n", name, *output, min_value);
+		return INVALID_PARAM;
+	}
+
+	return 2;
+}
+
+static int parse_int_one_of(
+	int *output,
+	const char *name,
+	const char *value,
+	int value_a,
+	int value_b
+) {
+	if (value == NULL) {
+		fprintf(stderr, "Missing %s value after option\n", name);
+		return INVALID_PARAM;
+	}
+
+	*output = strtol(value, NULL, 0);
+
+	if (*output != value_a && *output != value_b) {
+		fprintf(stderr, "Invalid %s: %d (must be %d or %d)\n", name, *output, value_a, value_b);
+		return INVALID_PARAM;
+	}
+
+	return 2;
+}
+
+static int parse_enum(
+	int *output,
+	const char *name,
+	const char *value,
+	const char *const *choices,
+	int count
+) {
+	if (value == NULL) {
+		fprintf(stderr, "Missing %s value after option\n", name);
+		return INVALID_PARAM;
+	}
+	for (int i = 0; i < count; i++) {
+		if (strcmp(value, choices[i]) == 0) {
+			*output = i;
+			return 2;
+		}
+	}
+
+	fprintf(
+		stderr,
+		"Invalid %s: %s\n"
+		"Must be one of the following values:\n",
+		name,
+		value
+	);
+	for (int i = 0; i < count; i++)
+		fprintf(stderr, "    %s\n", choices[i]);
+	return INVALID_PARAM;
+}
+
+static const char *const general_options_help =
+	"General options:\n"
+	"    -h                Show this help message and exit\n"
+	"    -V                Show version information and exit\n"
+	"    -q                Suppress all non-error messages\n"
+	"    -t format         Use (or show help for) specified output format\n"
+	"                        xa:     [A.] XA-ADPCM, 2336-byte sectors\n"
+	"                        xacd:   [A.] XA-ADPCM, 2352-byte sectors\n"
+	"                        spu:    [A.] raw SPU-ADPCM mono data\n"
+	"                        spui:   [A.] raw SPU-ADPCM interleaved data\n"
+	"                        vag:    [A.] .vag SPU-ADPCM mono\n"
+	"                        vagi:   [A.] .vag SPU-ADPCM interleaved\n"
+	"                        str:    [AV] .str video + XA-ADPCM, 2336-byte sectors\n"
+	"                        strcd:  [AV] .str video + XA-ADPCM, 2352-byte sectors\n"
+	//"                        strspu: [AV] .str video + SPU-ADPCM, 2048-byte sectors\n"
+	"                        strv:   [.V] .str video, 2048-byte sectors\n"
+	"                        sbs:    [.V] .sbs video\n"
+	"    -R key=value,...  Pass custom options to libswresample (see FFmpeg docs)\n"
+	"    -S key=value,...  Pass custom options to libswscale (see FFmpeg docs)\n"
+	"\n";
+
+static const char *const format_names[NUM_FORMATS] = {
+	"xa",
+	"xacd",
+	"spu",
+	"vag",
+	"spui",
+	"vagi",
+	"str",
+	"strcd",
+	"strspu",
+	"strv",
+	"sbs"
+};
+
+static void init_default_args(args_t *args) {
+	if (
+		args->format == FORMAT_XA ||
+		args->format == FORMAT_XACD ||
+		args->format == FORMAT_STR ||
+		args->format == FORMAT_STRCD
+	)
+		args->audio_frequency = 37800;
+	else
+		args->audio_frequency = 44100;
+
+	if (args->format == FORMAT_SPU || args->format == FORMAT_VAG)
+		args->audio_channels = 1;
+	else
+		args->audio_channels = 2;
+
+	args->audio_bit_depth = 4;
+	args->audio_xa_file = 0;
+	args->audio_xa_channel = 0;
+	args->audio_interleave = 2048;
+	args->audio_loop_point = -1;
+
+	args->video_codec = BS_CODEC_V2;
+	args->video_width = 320;
+	args->video_height = 240;
+
+	args->str_fps_num = 15;
+	args->str_fps_den = 1;
+	args->str_cd_speed = 2;
+	args->str_video_id = 0x8001;
+	args->str_audio_id = 0x0001;
+
+	if (args->format == FORMAT_SPU || args->format == FORMAT_VAG)
+		args->alignment = 64; // Default SPU DMA chunk size
+	else if (args->format == FORMAT_SBS)
+		args->alignment = 8192; // Default for System 573 games
+	else
+		args->alignment = 2048;
+}
+
+static int parse_general_option(args_t *args, char option, const char *param) {
+	int parsed;
+
+	switch (option) {
+		case '-':
+			args->flags |= FLAG_IGNORE_OPTIONS;
+			return 1;
+
+		case 'h':
+			args->flags |= FLAG_PRINT_HELP;
+			return 1;
+
+		case 'V':
+			args->flags |= FLAG_PRINT_VERSION;
+			return 1;
+
+		case 'q':
+			args->flags |= FLAG_QUIET | FLAG_HIDE_PROGRESS;
+			return 1;
+
+		case 't':
+			parsed = parse_enum(&(args->format), "format", param, format_names, NUM_FORMATS);
+			if (parsed > 0)
+				init_default_args(args);
+			return parsed;
+
+		case 'R':
+			if (param == NULL) {
+				fprintf(stderr, "Missing libswresample parameter list after option\n");
+				return INVALID_PARAM;
+			}
+
+			args->swresample_options = param;
+			return 2;
+
+		case 'S':
+			if (param == NULL) {
+				fprintf(stderr, "Missing libswscale parameter list after option\n");
+				return INVALID_PARAM;
+			}
+
+			args->swscale_options = param;
+			return 2;
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const xa_options_help =
+	"XA-ADPCM options:\n"
+	"    [-f 18900|37800] [-c 1|2] [-b 4|8] [-F 0-255] [-C 0-31]\n"
+	"\n"
+	"    -f 18900|37800    Use specified sample rate (default 37800)\n"
+	"    -c 1|2            Use specified channel count (default 2)\n"
+	"    -b 4|8            Use specified bit depth (default 4)\n"
+	"    -F 0-255          Set CD-XA file number (for both audio and video, default 0)\n"
+	"    -C 0-31           Set CD-XA channel number (for both audio and video, default 0)\n"
+	"\n";
+
+static int parse_xa_option(args_t *args, char option, const char *param) {
+	switch (option) {
+		case 'f':
+			return parse_int_one_of(&(args->audio_frequency), "sample rate", param, 18900, 37800);
+
+		case 'c':
+			return parse_int_one_of(&(args->audio_channels), "channel count", param, 1, 2);
+
+		case 'b':
+			return parse_int_one_of(&(args->audio_bit_depth), "bit depth", param, 4, 8);
+
+		case 'F':
+			return parse_int(&(args->audio_xa_file), "file number", param, 0, 255);
+
+		case 'C':
+			return parse_int(&(args->audio_xa_channel), "channel number", param, 0, 31);
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const spu_options_help =
+	"Mono SPU-ADPCM options:\n"
+	"    [-f freq] [-a size] [-l ms | -L] [-D]\n"
+	"\n"
+	"    -f freq           Use specified sample rate (default 44100)\n"
+	"    -a size           Pad audio data excluding header to multiple of given size (default 64)\n"
+	"    -l ms             Add loop point at specified offset (in milliseconds)\n"
+	"    -L                Set loop end flag at the end of data but do not add a loop point\n"
+	"    -D                Do not prepend encoded data with a dummy silent block\n"
+	"\n";
+
+static int parse_spu_option(args_t *args, char option, const char *param) {
+	switch (option) {
+		case 'f':
+			return parse_int(&(args->audio_frequency), "sample rate", param, 1, -1);
+
+		case 'a':
+			return parse_int(&(args->alignment), "alignment", param, 1, -1);
+
+		case 'l':
+			args->flags |= FLAG_SPU_LOOP_END;
+			return parse_int(&(args->audio_loop_point), "loop offset", param, 0, -1);
+
+		case 'L':
+			args->flags |= FLAG_SPU_LOOP_END;
+			return 1;
+
+		case 'D':
+			args->flags |= FLAG_SPU_NO_LEADING_DUMMY;
+			return 1;
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const spui_options_help =
+	"Interleaved SPU-ADPCM options:\n"
+	"    [-f freq] [-c channels] [-i size] [-a size] [-L] [-D]\n"
+	"\n"
+	"    -f freq           Use specified sample rate (default 44100)\n"
+	"    -c channels       Use specified channel count (default 2)\n"
+	"    -i size           Use specified channel interleave size (default 2048)\n"
+	"    -a size           Pad .vag header and each audio chunk to multiples of given size\n"
+	"                      (default 2048)\n"
+	"    -L                Set loop end flag at the end of each audio chunk\n"
+	"    -D                Do not prepend first chunk's data with a dummy silent block\n"
+	"\n";
+
+static int parse_spui_option(args_t *args, char option, const char *param) {
+	int parsed;
+
+	switch (option) {
+		case 'f':
+			return parse_int(&(args->audio_frequency), "sample rate", param, 1, -1);
+
+		case 'c':
+			return parse_int(&(args->audio_channels), "channel count", param, 1, -1);
+
+		case 'i':
+			parsed = parse_int(&(args->audio_interleave), "interleave", param, 16, -1);
+
+			// Round up to nearest multiple of 16
+			args->audio_interleave = (args->audio_interleave + 15) & ~15;
+			return parsed;
+
+		case 'a':
+			return parse_int(&(args->alignment), "alignment", param, 1, -1);
+
+		case 'L':
+			args->flags |= FLAG_SPU_LOOP_END;
+			return 1;
+
+		case 'D':
+			args->flags |= FLAG_SPU_NO_LEADING_DUMMY;
+			return 1;
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const bs_options_help =
+	"Video options:\n"
+	"    [-v v2|v3|v3dc] [-s WxH] [-I]\n"
+	"\n"
+	"    -v codec          Use specified video codec\n"
+	"                        v2:   MDEC BS v2 (default)\n"
+	"                        v3:   MDEC BS v3\n"
+	"                        v3dc: MDEC BS v3, expect decoder to wrap DC coefficients\n"
+	"    -s WxH            Rescale input file to fit within specified size\n"
+	"                      (16x16-640x512 in 16-pixel increments, default 320x240)\n"
+	"    -I                Force stretching to given size without preserving aspect ratio\n"
+	"\n";
+
+const char *const bs_codec_names[NUM_BS_CODECS] = {
+	"v2",
+	"v3",
+	"v3dc"
+};
+
+static int parse_bs_option(args_t *args, char option, const char *param) {
+	char *next = NULL;
+
+	switch (option) {
+		case 'v':
+			return parse_enum(&(args->video_codec), "video codec", param, bs_codec_names, NUM_BS_CODECS);
+
+		case 's':
+			if (param == NULL) {
+				fprintf(stderr, "Missing video size after option\n");
+				return INVALID_PARAM;
+			}
+
+			args->video_width = strtol(param, &next, 10);
+
+			if (next && *next == 'x') {
+				args->video_height = strtol(next + 1, NULL, 10);
+			} else {
+				fprintf(stderr, "Invalid video size (must be specified as <width>x<height>)\n");
+				return INVALID_PARAM;
+			}
+
+			if (args->video_width < 16 || args->video_width > 640) {
+				fprintf(stderr, "Invalid video width: %d (must be in 16-640 range)\n", args->video_width);
+				return INVALID_PARAM;
+			}
+			if (args->video_height < 16 || args->video_height > 512) {
+				fprintf(stderr, "Invalid video height: %d (must be in 16-512 range)\n", args->video_height);
+				return INVALID_PARAM;
+			}
+
+			// Round up to nearest multiples of 16
+			args->video_width = (args->video_width + 15) & ~15;
+			args->video_height = (args->video_height + 15) & ~15;
+			return 2;
+
+		case 'I':
+			args->flags |= FLAG_BS_IGNORE_ASPECT;
+			return 1;
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const str_options_help =
+	".str container options:\n"
+	"    [-r num[/den]] [-x 1|2] [-T id] [-A id] [-X]\n"
+	"\n"
+	"    -r num[/den]      Set video frame rate to specified integer or fraction (default 15)\n"
+	"    -x 1|2            Set CD-ROM speed the file is meant to be played at (default 2)\n"
+	"    -T id             Tag video sectors with specified .str type ID (default 0x8001)\n"
+	"    -A id             Tag SPU-ADPCM sectors with specified .str type ID (default 0x0001)\n"
+	"    -X                Place audio sectors after corresponding video sectors\n"
+	"                      (rather than ahead of them)\n"
+	"\n";
+
+static int parse_str_option(args_t *args, char option, const char *param) {
+	char *next = NULL;
+	int fps;
+
+	switch (option) {
+		case 'r':
+			if (param == NULL) {
+				fprintf(stderr, "Missing frame rate value after option\n");
+				return INVALID_PARAM;
+			}
+
+			args->str_fps_num = strtol(param, &next, 10);
+
+			if (next && *next == '/')
+				args->str_fps_den = strtol(next + 1, NULL, 10);
+			else
+				args->str_fps_den = 1;
+
+			if (args->str_fps_num <= 0 || args->str_fps_den <= 0) {
+				fprintf(stderr, "Invalid frame rate (must be a positive integer or fraction)\n");
+				return INVALID_PARAM;
+			}
+
+			fps = args->str_fps_num / args->str_fps_den;
+
+			if (fps < 1 || fps > 60) {
+				fprintf(stderr, "Invalid frame rate: %d/%d (must be in 1-60 range)\n", args->str_fps_num, args->str_fps_den);
+				return INVALID_PARAM;
+			}
+			return 2;
+
+		case 'x':
+			return parse_int_one_of(&(args->str_cd_speed), "CD-ROM speed", param, 1, 2);
+
+		case 'T':
+			return parse_int(&(args->str_video_id), "video track type ID", param, 0x0000, 0xFFFF);
+
+		case 'A':
+			return parse_int(&(args->str_audio_id), "audio track type ID", param, 0x0000, 0xFFFF);
+
+		case 'X':
+			args->flags |= FLAG_STR_TRAILING_AUDIO;
+			return 1;
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const sbs_options_help =
+	".sbs container options:\n"
+	"    [-a size]\n"
+	"\n"
+	"    -a size           Set size of each video frame (default 8192)\n"
+	"\n";
+
+static int parse_sbs_option(args_t *args, char option, const char *param) {
+	switch (option) {
+		case 'a':
+			return parse_int(&(args->alignment), "video frame size", param, 256, -1);
+
+		default:
+			return 0;
+	}
+}
+
+static const char *const general_usage =
+	"Usage:\n"
+	"    psxavenc -t xa|xacd   [xa-options]                              <in> <out.xa>\n"
+	"    psxavenc -t spu|vag   [spu-options]                             <in> <out.vag>\n"
+	"    psxavenc -t spui|vagi [spui-options]                            <in> <out.vag>\n"
+	"    psxavenc -t str|strcd [xa-options]   [bs-options] [str-options] <in> <out.str>\n"
+	//"    psxavenc -t strspu    [spui-options] [bs-options] [str-options] <in> <out.str>\n"
+	"    psxavenc -t strv                     [bs-options] [str-options] <in> <out.str>\n"
+	"    psxavenc -t sbs                      [bs-options] [sbs-options] <in> <out.sbs>\n"
+	"\n";
+
+static const struct {
+	const char *usage;
+	const char *audio_options_help;
+	const char *video_options_help;
+	const char *container_options_help;
+	int (*parse_audio_option)(args_t *, char, const char *);
+	int (*parse_video_option)(args_t *, char, const char *);
+	int (*parse_container_option)(args_t *, char, const char *);
+} format_info[NUM_FORMATS] = {
+	{
+		.usage = "psxavenc -t xa [xa-options] <in> <out.xa>",
+		.audio_options_help = xa_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_xa_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t xacd [xa-options] <in> <out.xa>",
+		.audio_options_help = xa_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_xa_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t spu [spu-options] <in> <out>",
+		.audio_options_help = spu_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_spu_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t vag [spu-options] <in> <out.vag>",
+		.audio_options_help = spu_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_spu_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t spui [spui-options] <in> <out>",
+		.audio_options_help = spui_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_spui_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t vagi [spui-options] <in> <out.vag>",
+		.audio_options_help = spui_options_help,
+		.video_options_help = NULL,
+		.container_options_help = NULL,
+		.parse_audio_option = parse_spui_option,
+		.parse_video_option = NULL,
+		.parse_container_option = NULL
+	}, {
+		.usage = "psxavenc -t str [xa-options] [bs-options] [str-options] <in> <out.str>",
+		.audio_options_help = xa_options_help,
+		.video_options_help = bs_options_help,
+		.container_options_help = str_options_help,
+		.parse_audio_option = parse_xa_option,
+		.parse_video_option = parse_bs_option,
+		.parse_container_option = parse_str_option
+	}, {
+		.usage = "psxavenc -t strcd [xa-options] [bs-options] [str-options] <in> <out.str>",
+		.audio_options_help = xa_options_help,
+		.video_options_help = bs_options_help,
+		.container_options_help = str_options_help,
+		.parse_audio_option = parse_xa_option,
+		.parse_video_option = parse_bs_option,
+		.parse_container_option = parse_str_option
+	}, {
+		.usage = "psxavenc -t strspu [spui-options] [bs-options] [str-options] <in> <out.str>",
+		.audio_options_help = spui_options_help,
+		.video_options_help = bs_options_help,
+		.container_options_help = str_options_help,
+		.parse_audio_option = parse_spui_option,
+		.parse_video_option = parse_bs_option,
+		.parse_container_option = parse_str_option
+	}, {
+		.usage = "psxavenc -t strv [bs-options] [str-options] <in> <out.str>",
+		.audio_options_help = NULL,
+		.video_options_help = bs_options_help,
+		.container_options_help = str_options_help,
+		.parse_audio_option = NULL,
+		.parse_video_option = parse_bs_option,
+		.parse_container_option = parse_str_option
+	}, {
+		.usage = "psxavenc -t sbs [bs-options] [sbs-options] <in> <out.sbs>",
+		.audio_options_help = NULL,
+		.video_options_help = bs_options_help,
+		.container_options_help = sbs_options_help,
+		.parse_audio_option = NULL,
+		.parse_video_option = parse_bs_option,
+		.parse_container_option = parse_sbs_option
+	}
+};
+
+static int parse_option(args_t *args, char option, const char *param) {
+	int parsed = parse_general_option(args, option, param);
+
+	if (parsed == 0 && args->format != FORMAT_INVALID) {
+		if (format_info[args->format].parse_audio_option != NULL)
+			parsed = format_info[args->format].parse_audio_option(args, option, param);
+	}
+	if (parsed == 0 && args->format != FORMAT_INVALID) {
+		if (format_info[args->format].parse_video_option != NULL)
+			parsed = format_info[args->format].parse_video_option(args, option, param);
+	}
+	if (parsed == 0 && args->format != FORMAT_INVALID) {
+		if (format_info[args->format].parse_container_option != NULL)
+			parsed = format_info[args->format].parse_container_option(args, option, param);
+	}
+	if (parsed == 0) {
+		if (args->format == FORMAT_INVALID)
+			fprintf(
+				stderr,
+				"Unknown general option: -%c\n"
+				"(if this is a format-specific option, it shall be passed after -t)\n",
+				option
+			);
+		else
+			fprintf(stderr, "Unknown option for format %s: -%c\n", format_names[args->format], option);
+	}
+
+	return parsed;
+}
+
+static void print_help(format_t format) {
+	if (format == FORMAT_INVALID) {
+		printf(
+			"%s%s%s%s%s%s%s%s",
+			general_usage,
+			general_options_help,
+			xa_options_help,
+			spu_options_help,
+			spui_options_help,
+			bs_options_help,
+			str_options_help,
+			sbs_options_help
+		);
+		return;
+	}
+
+	printf(
+		"Usage:\n"
+		"    %s\n"
+		"\n"
+		"%s",
+		format_info[format].usage,
+		general_options_help
+	);
+	if (format_info[format].audio_options_help != NULL)
+		printf("%s", format_info[format].audio_options_help);
+	if (format_info[format].video_options_help != NULL)
+		printf("%s", format_info[format].video_options_help);
+	if (format_info[format].container_options_help != NULL)
+		printf("%s", format_info[format].container_options_help);
+}
+
+bool parse_args(args_t *args, const char *const *options, int count) {
+	int arg_index = 0;
+
+	while (arg_index < count) {
+		const char *option = options[arg_index];
+
+		if (option[0] == '-' && option[1] != 0 && option[2] == 0 && !(args->flags & FLAG_IGNORE_OPTIONS)) {
+			const char *param;
+			if ((arg_index + 1) < count)
+				param = options[arg_index + 1];
+			else
+				param = NULL;
+
+			int parsed = parse_option(args, option[1], param);
+			if (parsed <= 0)
+				return false;
+
+			arg_index += parsed;
+			continue;
+		}
+
+		if (args->input_file == NULL) {
+			args->input_file = option;
+		} else if (args->output_file == NULL) {
+			args->output_file = option;
+		} else {
+			fprintf(stderr, "There should be no arguments after the output file path\n");
+			return false;
+		}
+		arg_index++;
+	}
+
+	if (args->flags & FLAG_PRINT_HELP) {
+		print_help(args->format);
+		return false;
+	}
+	if (args->flags & FLAG_PRINT_VERSION) {
+		printf("psxavenc " VERSION "\n");
+		return false;
+	}
+	if (args->format == FORMAT_INVALID || args->input_file == NULL || args->output_file == NULL) {
+		fprintf(
+			stderr,
+			"%s"
+			"For more information about the options supported for a given output format, run:\n"
+			"    psxavenc -t <format> -h\n"
+			"To view the full list of supported options, run:\n"
+			"    psxavenc -h\n",
+			general_usage
+		);
+		return false;
+	}
+
+	return true;
+}
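Note on the return-value convention used throughout `args.c`: each `parse_*_option()` function returns the number of command-line entries it consumed (1 for a bare flag, 2 for an option plus its value), 0 if it does not recognize the option, and `INVALID_PARAM` on error. `parse_args()` then advances its index by that amount. The following is a minimal standalone sketch of that pattern, not the code from this PR. The `demo_` names and the `-v`/`-n` options are hypothetical, and the `--` handling is left out for brevity.

```c
#include <stddef.h>
#include <string.h>

#define INVALID_PARAM -1

// Hypothetical option set used only for this sketch.
typedef struct {
	int verbose;      // set by -v (flag, takes no value)
	const char *name; // set by -n <value>
} demo_args_t;

// Returns the number of entries consumed: 1 for a bare flag, 2 for an option
// plus its value, 0 if the option is unknown, INVALID_PARAM on error.
static int demo_parse_option(demo_args_t *args, char option, const char *param) {
	switch (option) {
		case 'v':
			args->verbose = 1;
			return 1;

		case 'n':
			if (param == NULL)
				return INVALID_PARAM;

			args->name = param;
			return 2;

		default:
			return 0;
	}
}

// Walks the argument list, advancing by whatever each parser consumed.
// Returns the number of positional arguments seen, or -1 on failure.
static int demo_parse_args(demo_args_t *args, const char *const *options, int count) {
	int positional = 0;
	int index = 0;

	while (index < count) {
		const char *option = options[index];

		// Only "-x" style entries (exactly two characters) count as options.
		if (strlen(option) == 2 && option[0] == '-') {
			const char *param = (index + 1 < count) ? options[index + 1] : NULL;
			int parsed = demo_parse_option(args, option[1], param);

			// Both "unknown option" (0) and "invalid value" (-1) abort.
			if (parsed <= 0)
				return -1;

			index += parsed;
			continue;
		}

		positional++;
		index++;
	}

	return positional;
}
```

Returning the consumed count keeps the per-format parsers free of any index bookkeeping, and lets the dispatcher chain several parsers by trying the next one whenever the previous one returns 0.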
--- a/psxavenc/args.h
+++ b/psxavenc/args.h
@ -0,0 +1,95 @@
+/*
+psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
+
+Copyright (c) 2019, 2020 Adrian "asie" Siekierka
+Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023, 2025 spicyjpeg
+
+This software is provided 'as-is', without any express or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
+*/
+
+#pragma once
+
+#include <stdbool.h>
+
+#define NUM_FORMATS   11
+#define NUM_BS_CODECS 3
+
+enum {
+	FLAG_IGNORE_OPTIONS       = 1 << 0,
+	FLAG_QUIET                = 1 << 1,
+	FLAG_HIDE_PROGRESS        = 1 << 2,
+	FLAG_PRINT_HELP           = 1 << 3,
+	FLAG_PRINT_VERSION        = 1 << 4,
+	FLAG_SPU_LOOP_END         = 1 << 5,
+	FLAG_SPU_NO_LEADING_DUMMY = 1 << 6,
+	FLAG_BS_IGNORE_ASPECT     = 1 << 7,
+	FLAG_STR_TRAILING_AUDIO   = 1 << 8
+};
+
+typedef enum {
+	FORMAT_INVALID = -1,
+	FORMAT_XA,
+	FORMAT_XACD,
+	FORMAT_SPU,
+	FORMAT_VAG,
+	FORMAT_SPUI,
+	FORMAT_VAGI,
+	FORMAT_STR,
+	FORMAT_STRCD,
+	FORMAT_STRSPU,
+	FORMAT_STRV,
+	FORMAT_SBS
+} format_t;
+
+typedef enum {
+	BS_CODEC_INVALID = -1,
+	BS_CODEC_V2,
+	BS_CODEC_V3,
+	BS_CODEC_V3DC
+} bs_codec_t;
+
+typedef struct {
+	int flags;
+
+	format_t format;
+	const char *input_file;
+	const char *output_file;
+	const char *swresample_options;
+	const char *swscale_options;
+
+	int audio_frequency; // Hz (18900 or 37800 for XA-ADPCM)
+	int audio_channels;
+	int audio_bit_depth; // 4 or 8
+	int audio_xa_file; // 00-FF
+	int audio_xa_channel; // 00-1F
+	int audio_interleave;
+	int audio_loop_point;
+
+	bs_codec_t video_codec;
+	int video_width;
+	int video_height;
+
+	int str_fps_num;
+	int str_fps_den;
+	int str_cd_speed; // 1 or 2
+	int str_video_id;
+	int str_audio_id;
+	int alignment;
+} args_t;
+
+bool parse_args(args_t *args, const char *const *options, int count);
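Note on initializing `args_t`: because `FORMAT_INVALID` is -1 and `FORMAT_XA` is 0, a zero-filled structure would look as if `-t xa` had already been given. The caller therefore presumably has to set `format` explicitly before calling `parse_args()`; the actual caller is not part of this excerpt, so treat that as an assumption. The sketch below uses hypothetical `demo_` types to show that initialization together with the bit-flag idiom from the `FLAG_*` enum.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical flags mirroring the style of the FLAG_* enum.
enum {
	DEMO_FLAG_QUIET         = 1 << 0,
	DEMO_FLAG_HIDE_PROGRESS = 1 << 1,
	DEMO_FLAG_LOOP_END      = 1 << 2
};

typedef enum {
	DEMO_FORMAT_INVALID = -1,
	DEMO_FORMAT_XA,
	DEMO_FORMAT_SPU
} demo_format_t;

typedef struct {
	int flags;
	demo_format_t format;
	const char *input_file;
} demo_state_t;

// Clears every field, then marks the format as "not chosen yet". Leaving the
// format zeroed would silently select DEMO_FORMAT_XA.
static void demo_init(demo_state_t *state) {
	*state = (demo_state_t){ .format = DEMO_FORMAT_INVALID };
}

// Tests a single flag bit.
static bool demo_has_flag(const demo_state_t *state, int flag) {
	return (state->flags & flag) != 0;
}
```

Several flags can be set in one statement, as `-q` does in `args.c`: `state.flags |= DEMO_FLAG_QUIET | DEMO_FLAG_HIDE_PROGRESS;`.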
--- a/psxavenc/common.h
+++ b/psxavenc/common.h
@ -1,146 +0,0 @@
-/*
-psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
-
-Copyright (c) 2019, 2020 Adrian "asie" Siekierka
-Copyright (c) 2019 Ben "GreaseMonkey" Russell
-
-This software is provided 'as-is', without any express or implied
-warranty. In no event will the authors be held liable for any damages
-arising from the use of this software.
-
-Permission is granted to anyone to use this software for any purpose,
-including commercial applications, and to alter it and redistribute it
-freely, subject to the following restrictions:
-
-1. The origin of this software must not be misrepresented; you must not
-   claim that you wrote the original software. If you use this software
-   in a product, an acknowledgment in the product documentation would be
-   appreciated but is not required.
-2. Altered source versions must be plainly marked as such, and must not be
-   misrepresented as being the original software.
-3. This notice may not be removed or altered from any source distribution.
-*/
-
-#include <assert.h>
-#include <getopt.h>
-#include <stdbool.h>
-#include <stdint.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <math.h>
-#include <time.h>
-#include <unistd.h>
-
-#include <libavutil/opt.h>
-#include <libavcodec/avcodec.h>
-#include <libavformat/avformat.h>
-#include <libswscale/swscale.h>
-#include <libswresample/swresample.h>
-#include <libpsxav.h>
-
-#define NUM_FORMATS 9
-#define FORMAT_XA 0
-#define FORMAT_XACD 1
-#define FORMAT_SPU 2
-#define FORMAT_SPUI 3
-#define FORMAT_VAG 4
-#define FORMAT_VAGI 5
-#define FORMAT_STR2 6
-#define FORMAT_STR2CD 7
-#define FORMAT_SBS2 8
-
-typedef struct {
-	int frame_index;
-	int frame_data_offset;
-	int frame_max_size;
-	int frame_block_base_overflow;
-	int frame_block_overflow_num;
-	int frame_block_overflow_den;
-	uint16_t bits_value;
-	int bits_left;
-	uint8_t *frame_output;
-	int bytes_used;
-	int blocks_used;
-	int uncomp_hwords_used;
-	int quant_scale;
-	int quant_scale_sum;
-	float *dct_block_lists[6];
-} vid_encoder_state_t;
-
-typedef struct {
-	int video_frame_dst_size;
-	int audio_stream_index;
-	int video_stream_index;
-	AVFormatContext* format;
-	AVStream* audio_stream;
-	AVStream* video_stream;
-	AVCodecContext* audio_codec_context;
-	AVCodecContext* video_codec_context;
-	struct SwrContext* resampler;
-	struct SwsContext* scaler;
-	AVFrame* frame;
-
-	int sample_count_mul;
-
-	double video_next_pts;
-} av_decoder_state_t;
-
-typedef struct {
-	bool quiet;
-	bool show_progress;
-
-	int format; // FORMAT_*
-	int channels;
-	int cd_speed; // 1 or 2
-	int frequency; // 18900 or 37800 Hz
-	int bits_per_sample; // 4 or 8
-	int file_number; // 00-FF
-	int channel_number; // 00-1F
-	int interleave;
-	int alignment;
-	bool loop;
-
-	int video_width;
-	int video_height;
-	int video_fps_num; // FPS numerator
-	int video_fps_den; // FPS denominator
-	bool ignore_aspect_ratio;
-
-	char *swresample_options;
-	char *swscale_options;
-
-	int16_t *audio_samples;
-	int audio_sample_count;
-	uint8_t *video_frames;
-	int video_frame_count;
-
-	av_decoder_state_t decoder_state_av;
-	vid_encoder_state_t state_vid;
-	bool end_of_input;
-
-	time_t start_time;
-	time_t last_progress_update;
-} settings_t;
-
-// cdrom.c
-void init_sector_buffer_video(uint8_t *buffer, settings_t *settings);
-void calculate_edc_data(uint8_t *buffer);
-
-// decoding.c
-bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bool use_video, bool audio_required, bool video_required);
-bool poll_av_data(settings_t *settings);
-bool ensure_av_data(settings_t *settings, int needed_audio_samples, int needed_video_frames);
-void retire_av_data(settings_t *settings, int retired_audio_samples, int retired_video_frames);
-void close_av_data(settings_t *settings);
-
-// filefmt.c
-void encode_file_spu(settings_t *settings, FILE *output);
-void encode_file_spu_interleaved(settings_t *settings, FILE *output);
-void encode_file_xa(settings_t *settings, FILE *output);
-void encode_file_str(settings_t *settings, FILE *output);
-void encode_file_sbs(settings_t *settings, FILE *output);
-
-// mdec.c
-void encode_frame_bs(uint8_t *video_frame, settings_t *settings);
-void encode_sector_str(uint8_t *video_frames, uint8_t *output, settings_t *settings);
--- a/psxavenc/decoding.c
+++ b/psxavenc/decoding.c
@ -22,30 +22,52 @@ freely, subject to the following restrictions:
 3. This notice may not be removed or altered from any source distribution.
 */

-#include "common.h"
-
-int decode_frame(AVCodecContext *codec, AVFrame *frame, int *frame_size, AVPacket *packet) {
-	int ret;
+#include <assert.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <libavutil/opt.h>
+#include <libavcodec/avcodec.h>
+#include <libavcodec/avdct.h>
+#include <libavformat/avformat.h>
+#include <libswresample/swresample.h>
+#include <libswscale/swscale.h>
+#include "args.h"
+#include "decoding.h"

+static bool decode_frame(AVCodecContext *codec, AVFrame *frame, int *frame_size, AVPacket *packet) {
 	if (packet != NULL) {
-		ret = avcodec_send_packet(codec, packet);
-		if (ret != 0) {
-			return 0;
-		}
+		if (avcodec_send_packet(codec, packet) != 0)
+			return false;
 	}

-	ret = avcodec_receive_frame(codec, frame);
+	int ret = avcodec_receive_frame(codec, frame);
+
 	if (ret >= 0) {
 		*frame_size = ret;
-		return 1;
-	} else {
-		return ret == AVERROR(EAGAIN) ? 1 : 0;
+		return true;
 	}
+	if (ret == AVERROR(EAGAIN))
+		return true;
+
+	return false;
 }

-bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bool use_video, bool audio_required, bool video_required)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
+bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
+	decoder->audio_samples = NULL;
+	decoder->audio_sample_count = 0;
+	decoder->video_frames = NULL;
+	decoder->video_frame_count = 0;
+
+	decoder->video_width = args->video_width;
+	decoder->video_height = args->video_height;
+	decoder->video_fps_num = args->str_fps_num;
+	decoder->video_fps_den = args->str_fps_den;
+	decoder->end_of_input = false;
+
+	decoder_state_t *av = &(decoder->state);
+
 	av->video_next_pts = 0.0;
 	av->frame = NULL;
 	av->video_frame_dst_size = 0;
@ -59,19 +81,17 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 	av->resampler = NULL;
 	av->scaler = NULL;

-	if (settings->quiet) {
+	if (args->flags & FLAG_QUIET)
 		av_log_set_level(AV_LOG_QUIET);
-	}

 	av->format = avformat_alloc_context();
-	if (avformat_open_input(&(av->format), filename, NULL, NULL)) {
-		return false;
-	}
-	if (avformat_find_stream_info(av->format, NULL) < 0) {
-		return false;
-	}

-	if (use_audio) {
+	if (avformat_open_input(&(av->format), args->input_file, NULL, NULL))
+		return false;
+	if (avformat_find_stream_info(av->format, NULL) < 0)
+		return false;
+
+	if (flags & DECODER_USE_AUDIO) {
 		for (int i = 0; i < av->format->nb_streams; i++) {
 			if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
 				if (av->audio_stream_index >= 0) {
@ -81,13 +101,14 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 				av->audio_stream_index = i;
 			}
 		}
-		if (audio_required && av->audio_stream_index == -1) {
+
+		if ((flags & DECODER_AUDIO_REQUIRED) && av->audio_stream_index == -1) {
 			fprintf(stderr, "Input file has no audio data\n");
 			return false;
 		}
 	}

-	if (use_video) {
+	if (flags & DECODER_USE_VIDEO) {
 		for (int i = 0; i < av->format->nb_streams; i++) {
 			if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
 				if (av->video_stream_index >= 0) {
@ -97,7 +118,8 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 				av->video_stream_index = i;
 			}
 		}
-		if (video_required && av->video_stream_index == -1) {
+
+		if ((flags & DECODER_VIDEO_REQUIRED) && av->video_stream_index == -1) {
 			fprintf(stderr, "Input file has no video data\n");
 			return false;
 		}
@ -109,34 +131,39 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 	if (av->audio_stream != NULL) {
 		const AVCodec *codec = avcodec_find_decoder(av->audio_stream->codecpar->codec_id);
 		av->audio_codec_context = avcodec_alloc_context3(codec);
-		if (av->audio_codec_context == NULL) {
+
+		if (av->audio_codec_context == NULL)
 			return false;
-		}
-		if (avcodec_parameters_to_context(av->audio_codec_context, av->audio_stream->codecpar) < 0) {
+		if (avcodec_parameters_to_context(av->audio_codec_context, av->audio_stream->codecpar) < 0)
 			return false;
-		}
-		if (avcodec_open2(av->audio_codec_context, codec, NULL) < 0) {
+		if (avcodec_open2(av->audio_codec_context, codec, NULL) < 0)
 			return false;
-		}

 		AVChannelLayout layout;
-		layout.nb_channels = settings->channels;
-		if (settings->channels <= 2) {
+		layout.nb_channels = args->audio_channels;
+
+		if (args->audio_channels == 1) {
 			layout.order = AV_CHANNEL_ORDER_NATIVE;
-			layout.u.mask = (settings->channels == 2) ? AV_CH_LAYOUT_STEREO : AV_CH_LAYOUT_MONO;
+			layout.u.mask = AV_CH_LAYOUT_MONO;
+		} else if (args->audio_channels == 2) {
+			layout.order = AV_CHANNEL_ORDER_NATIVE;
+			layout.u.mask = AV_CH_LAYOUT_STEREO;
 		} else {
 			layout.order = AV_CHANNEL_ORDER_UNSPEC;
 		}
-		if (!settings->quiet && settings->channels > av->audio_codec_context->ch_layout.nb_channels) {
-			fprintf(stderr, "Warning: input file has less than %d channels\n", settings->channels);
+
+		if (!(args->flags & FLAG_QUIET)) {
+			if (args->audio_channels > av->audio_codec_context->ch_layout.nb_channels)
+				fprintf(stderr, "Warning: input file has fewer than %d channels\n", args->audio_channels);
 		}

-		av->sample_count_mul = settings->channels;
+		av->sample_count_mul = args->audio_channels;
+
 		if (swr_alloc_set_opts2(
 			&av->resampler,
 			&layout,
 			AV_SAMPLE_FMT_S16,
-			settings->frequency,
+			args->audio_frequency,
 			&av->audio_codec_context->ch_layout,
 			av->audio_codec_context->sample_fmt,
 			av->audio_codec_context->sample_rate,
@ -145,47 +172,43 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 		) < 0) {
 			return false;
 		}
-		if (settings->swresample_options) {
-			if (av_opt_set_from_string(av->resampler, settings->swresample_options, NULL, "=", ":,") < 0) {
+		if (args->swresample_options) {
+			if (av_opt_set_from_string(av->resampler, args->swresample_options, NULL, "=", ":,") < 0)
 				return false;
-			}
 		}
-
-		if (swr_init(av->resampler) < 0) {
+		if (swr_init(av->resampler) < 0)
 			return false;
-		}
 	}

 	if (av->video_stream != NULL) {
 		const AVCodec *codec = avcodec_find_decoder(av->video_stream->codecpar->codec_id);
 		av->video_codec_context = avcodec_alloc_context3(codec);
-		if(av->video_codec_context == NULL) {
+
+		if (av->video_codec_context == NULL)
 			return false;
-		}
-		if (avcodec_parameters_to_context(av->video_codec_context, av->video_stream->codecpar) < 0) {
+		if (avcodec_parameters_to_context(av->video_codec_context, av->video_stream->codecpar) < 0)
 			return false;
-		}
-		if (avcodec_open2(av->video_codec_context, codec, NULL) < 0) {
+		if (avcodec_open2(av->video_codec_context, codec, NULL) < 0)
 			return false;
+
+		if (!(args->flags & FLAG_QUIET)) {
+			if (
+				decoder->video_width > av->video_codec_context->width ||
+				decoder->video_height > av->video_codec_context->height
+			)
+				fprintf(stderr, "Warning: input file has resolution lower than %dx%d\n", decoder->video_width, decoder->video_height);
 		}

-		if (!settings->quiet && (
-			settings->video_width > av->video_codec_context->width ||
-			settings->video_height > av->video_codec_context->height
-		)) {
-			fprintf(stderr, "Warning: input file has resolution lower than %dx%d\n",
-				settings->video_width, settings->video_height
-			);
-		}
-		if (!settings->ignore_aspect_ratio) {
+		if (!(args->flags & FLAG_BS_IGNORE_ASPECT)) {
 			// Reduce the provided size so that it matches the input file's
 			// aspect ratio.
 			double src_ratio = (double)av->video_codec_context->width / (double)av->video_codec_context->height;
-			double dst_ratio = (double)settings->video_width / (double)settings->video_height;
+			double dst_ratio = (double)decoder->video_width / (double)decoder->video_height;
+
 			if (src_ratio < dst_ratio) {
-				settings->video_width = (int)((double)settings->video_height * src_ratio + 15.0) & ~15;
+				decoder->video_width = (int)((double)decoder->video_height * src_ratio + 15.0) & ~15;
 			} else {
-				settings->video_height = (int)((double)settings->video_width / src_ratio + 15.0) & ~15;
+				decoder->video_height = (int)((double)decoder->video_width / src_ratio + 15.0) & ~15;
 			}
 		}

@ -193,219 +216,248 @@ bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bo
 			av->video_codec_context->width,
 			av->video_codec_context->height,
 			av->video_codec_context->pix_fmt,
-			settings->video_width,
-			settings->video_height,
+			decoder->video_width,
+			decoder->video_height,
 			AV_PIX_FMT_NV21,
 			SWS_BICUBIC,
 			NULL,
 			NULL,
 			NULL
 		);
-		if (av->scaler == NULL) {
+		if (av->scaler == NULL)
 			return false;
-		}
-#if 0
-		// FIXME: if this is uncommented libswscale may produce completely black
-		// frames for whatever reason...
 		if (sws_setColorspaceDetails(
 			av->scaler,
 			sws_getCoefficients(av->video_codec_context->colorspace),
-			(av->video_codec_context->color_range == AVCOL_RANGE_JPEG),
+			av->video_codec_context->color_range == AVCOL_RANGE_JPEG,
 			sws_getCoefficients(SWS_CS_ITU601),
 			true,
 			0,
-			0,
-			0
-		) < 0) {
+			1 << 16,
+			1 << 16
+		) < 0)
 			return false;
-		}
-#endif
-		if (settings->swscale_options) {
-			if (av_opt_set_from_string(av->scaler, settings->swscale_options, NULL, "=", ":,") < 0) {
+		if (args->swscale_options) {
+			if (av_opt_set_from_string(av->scaler, args->swscale_options, NULL, "=", ":,") < 0)
 				return false;
-			}
 		}

-		av->video_frame_dst_size = 3*settings->video_width*settings->video_height/2;
+		av->video_frame_dst_size = 3 * decoder->video_width * decoder->video_height / 2;
 	}

 	av->frame = av_frame_alloc();
-	if (av->frame == NULL) {
-		return false;
-	}

-	settings->audio_samples = NULL;
-	settings->audio_sample_count = 0;
-	settings->video_frames = NULL;
-	settings->video_frame_count = 0;
-	settings->end_of_input = false;
+	if (av->frame == NULL)
+		return false;

 	return true;
 }

-static void poll_av_packet_audio(settings_t *settings, AVPacket *packet)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
-
-	int frame_size, frame_sample_count;
-	uint8_t *buffer[1];
-
-	if (decode_frame(av->audio_codec_context, av->frame, &frame_size, packet)) {
-		size_t buffer_size = sizeof(int16_t) * av->sample_count_mul * swr_get_out_samples(av->resampler, av->frame->nb_samples);
-		buffer[0] = malloc(buffer_size);
-		memset(buffer[0], 0, buffer_size);
-		frame_sample_count = swr_convert(av->resampler, buffer, av->frame->nb_samples, (const uint8_t**)av->frame->data, av->frame->nb_samples);
-		settings->audio_samples = realloc(settings->audio_samples, (settings->audio_sample_count + ((frame_sample_count + 4032) * av->sample_count_mul)) * sizeof(int16_t));
-		memmove(&(settings->audio_samples[settings->audio_sample_count]), buffer[0], sizeof(int16_t) * frame_sample_count * av->sample_count_mul);
-		settings->audio_sample_count += frame_sample_count * av->sample_count_mul;
-		free(buffer[0]);
-	}
-}
-
-static void poll_av_packet_video(settings_t *settings, AVPacket *packet)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
+static void poll_av_packet_audio(decoder_t *decoder, AVPacket *packet) {
+	decoder_state_t *av = &(decoder->state);

 	int frame_size;
-	double pts_step = ((double)1.0*(double)settings->video_fps_den)/(double)settings->video_fps_num;

-	int plane_size = settings->video_width*settings->video_height;
-	int dst_strides[2] = {
-		settings->video_width, settings->video_width
-	};
+	if (!decode_frame(av->audio_codec_context, av->frame, &frame_size, packet))
+		return;

-	if (decode_frame(av->video_codec_context, av->frame, &frame_size, packet)) {
-		if (!av->frame->width || !av->frame->height || !av->frame->data[0]) {
-			return;
-		}
+	int frame_sample_count = swr_get_out_samples(av->resampler, av->frame->nb_samples);

-		// Some files seem to have timestamps starting from a negative value
-		// (but otherwise valid) for whatever reason.
-		double pts = (((double)av->frame->pts)*(double)av->video_stream->time_base.num)/av->video_stream->time_base.den;
-		//if (pts < 0.0) {
-			//return;
-		//}
-		if (settings->video_frame_count >= 1 && pts < av->video_next_pts) {
-			return;
-		}
-		if ((settings->video_frame_count) < 1) {
-			av->video_next_pts = pts;
-		} else {
-			av->video_next_pts += pts_step;
-		}
+	if (frame_sample_count == 0)
+		return;

-		//fprintf(stderr, "%d %f %f %f\n", (settings->video_frame_count), pts, av->video_next_pts, pts_step);
+	size_t buffer_size = sizeof(int16_t) * av->sample_count_mul * frame_sample_count;
+	uint8_t *buffer = malloc(buffer_size);
+	memset(buffer, 0, buffer_size);

-		// Insert duplicate frames if the frame rate of the input stream is
-		// lower than the target frame rate.
-		int dupe_frames = (int) ceil((pts - av->video_next_pts) / pts_step);
-		if (dupe_frames < 0) dupe_frames = 0;
-		settings->video_frames = realloc(
-			settings->video_frames,
-			(settings->video_frame_count + dupe_frames + 1) * av->video_frame_dst_size
-		);
+	frame_sample_count = swr_convert(
+		av->resampler,
+		&buffer,
+		frame_sample_count,
+		(const uint8_t**)av->frame->data,
+		av->frame->nb_samples
+	);

-		for (; dupe_frames; dupe_frames--) {
-			memcpy(
-				(settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count),
-				(settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count-1),
-				av->video_frame_dst_size
-			);
-			settings->video_frame_count += 1;
-			av->video_next_pts += pts_step;
-		}
-
-		uint8_t *dst_frame = (settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count);
-		uint8_t *dst_pointers[2] = {
-			dst_frame, dst_frame + plane_size
-		};
-		sws_scale(av->scaler, (const uint8_t *const *) av->frame->data, av->frame->linesize, 0, av->frame->height, dst_pointers, dst_strides);
-
-		settings->video_frame_count += 1;
-	}
+	decoder->audio_samples = realloc(
+		decoder->audio_samples,
+		(decoder->audio_sample_count + ((frame_sample_count + 4032) * av->sample_count_mul)) * sizeof(int16_t)
+	);
+	memmove(
+		&(decoder->audio_samples[decoder->audio_sample_count]),
+		buffer,
+		sizeof(int16_t) * frame_sample_count * av->sample_count_mul
+	);
+	decoder->audio_sample_count += frame_sample_count * av->sample_count_mul;
+	free(buffer);
 }

-bool poll_av_data(settings_t *settings)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
-	AVPacket packet;
+static void poll_av_packet_video(decoder_t *decoder, AVPacket *packet) {
+	decoder_state_t *av = &(decoder->state);

-	if (settings->end_of_input) {
-		return false;
+	int frame_size;
+	double pts_step = (double)decoder->video_fps_den / (double)decoder->video_fps_num;
+
+	int plane_size = decoder->video_width * decoder->video_height;
+	int dst_strides[2] = {
+		decoder->video_width, decoder->video_width
+	};
+
+	if (!decode_frame(av->video_codec_context, av->frame, &frame_size, packet))
+		return;
+	if (!av->frame->width || !av->frame->height || !av->frame->data[0])
+		return;
+
+	// Some files seem to have timestamps starting from a negative value
+	// (but otherwise valid) for whatever reason.
+	double pts =
+		((double)av->frame->pts * (double)av->video_stream->time_base.num)
+		/ av->video_stream->time_base.den;
+#if 0
+	if (pts < 0.0)
+		return;
+#endif
+	if (decoder->video_frame_count >= 1 && pts < av->video_next_pts)
+		return;
+	if (decoder->video_frame_count < 1)
+		av->video_next_pts = pts;
+	else
+		av->video_next_pts += pts_step;
+
+	//fprintf(stderr, "%d %f %f %f\n", decoder->video_frame_count, pts, av->video_next_pts, pts_step);
+
+	// Insert duplicate frames if the frame rate of the input stream is
+	// lower than the target frame rate.
+	int dupe_frames = (int) ceil((pts - av->video_next_pts) / pts_step);
+	if (dupe_frames < 0) dupe_frames = 0;
+	decoder->video_frames = realloc(
+		decoder->video_frames,
+		(decoder->video_frame_count + dupe_frames + 1) * av->video_frame_dst_size
+	);
+
+	for (; dupe_frames; dupe_frames--) {
+		memcpy(
+			(decoder->video_frames) + av->video_frame_dst_size * decoder->video_frame_count,
+			(decoder->video_frames) + av->video_frame_dst_size * (decoder->video_frame_count - 1),
+			av->video_frame_dst_size
+		);
+		decoder->video_frame_count += 1;
+		av->video_next_pts += pts_step;
 	}

+	uint8_t *dst_frame = decoder->video_frames + av->video_frame_dst_size * decoder->video_frame_count;
+	uint8_t *dst_pointers[2] = {
+		dst_frame, dst_frame + plane_size
+	};
+	sws_scale(
+		av->scaler,
+		(const uint8_t *const *) av->frame->data,
+		av->frame->linesize,
+		0,
+		av->frame->height,
+		dst_pointers,
+		dst_strides
+	);
+
+	decoder->video_frame_count += 1;
+}
+
+bool poll_av_data(decoder_t *decoder) {
+	decoder_state_t *av = &(decoder->state);
+
+	if (decoder->end_of_input)
+		return false;
+
+	AVPacket packet;
+
 	if (av_read_frame(av->format, &packet) >= 0) {
-		if (packet.stream_index == av->audio_stream_index) {
-			poll_av_packet_audio(settings, &packet);
-		} else if (packet.stream_index == av->video_stream_index) {
-			poll_av_packet_video(settings, &packet);
-		}
+		if (packet.stream_index == av->audio_stream_index)
+			poll_av_packet_audio(decoder, &packet);
+		else if (packet.stream_index == av->video_stream_index)
+			poll_av_packet_video(decoder, &packet);
+
 		av_packet_unref(&packet);
 		return true;
 	} else {
 		// out is always padded out with 4032 "0" samples, this makes calculations elsewhere easier
-		if (av->audio_stream) {
-			memset((settings->audio_samples) + (settings->audio_sample_count), 0, 4032 * av->sample_count_mul * sizeof(int16_t));
-		}
+		if (av->audio_stream)
+			memset(
+				decoder->audio_samples + decoder->audio_sample_count,
+				0,
+				4032 * av->sample_count_mul * sizeof(int16_t)
+			);

-		settings->end_of_input = true;
+		decoder->end_of_input = true;
 		return false;
 	}
 }

-bool ensure_av_data(settings_t *settings, int needed_audio_samples, int needed_video_frames)
-{
-	while (settings->audio_sample_count < needed_audio_samples || settings->video_frame_count < needed_video_frames) {
-		//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", settings->audio_sample_count, needed_audio_samples, settings->video_frame_count, needed_video_frames);
-		if (!poll_av_data(settings)) {
+bool ensure_av_data(decoder_t *decoder, int needed_audio_samples, int needed_video_frames) {
+	// HACK: in order to update decoder->end_of_input as soon as all data has
+	// been read from the input file, this loop waits for more data than
+	// strictly needed.
+#if 0
+	while (decoder->audio_sample_count < needed_audio_samples || decoder->video_frame_count < needed_video_frames) {
+#else
+	while (
+		(needed_audio_samples && decoder->audio_sample_count <= needed_audio_samples) ||
+		(needed_video_frames && decoder->video_frame_count <= needed_video_frames)
+	) {
+#endif
+		//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", decoder->audio_sample_count, needed_audio_samples, decoder->video_frame_count, needed_video_frames);
+		if (!poll_av_data(decoder)) {
 			// Keep returning true even if the end of the input file has been
 			// reached, if the buffer is not yet completely empty.
-			return (settings->audio_sample_count || !needed_audio_samples)
-				&& (settings->video_frame_count || !needed_video_frames);
+			return
+				(decoder->audio_sample_count || !needed_audio_samples) &&
+				(decoder->video_frame_count || !needed_video_frames);
 		}
 	}
-	//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", settings->audio_sample_count, needed_audio_samples, settings->video_frame_count, needed_video_frames);
+	//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", decoder->audio_sample_count, needed_audio_samples, decoder->video_frame_count, needed_video_frames);

 	return true;
 }

-void retire_av_data(settings_t *settings, int retired_audio_samples, int retired_video_frames)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
-
-	//fprintf(stderr, "retire %d -> %d, %d -> %d\n", settings->audio_sample_count, retired_audio_samples, settings->video_frame_count, retired_video_frames);
-	assert(retired_audio_samples <= settings->audio_sample_count);
-	assert(retired_video_frames <= settings->video_frame_count);
+void retire_av_data(decoder_t *decoder, int retired_audio_samples, int retired_video_frames) {
+	//fprintf(stderr, "retire %d -> %d, %d -> %d\n", decoder->audio_sample_count, retired_audio_samples, decoder->video_frame_count, retired_video_frames);
+	assert(retired_audio_samples <= decoder->audio_sample_count);
+	assert(retired_video_frames <= decoder->video_frame_count);

 	int sample_size = sizeof(int16_t);
-	if (settings->audio_sample_count > retired_audio_samples) {
-		memmove(settings->audio_samples, settings->audio_samples + retired_audio_samples, (settings->audio_sample_count - retired_audio_samples)*sample_size);
-	}
-	settings->audio_sample_count -= retired_audio_samples;
+	int frame_size = decoder->state.video_frame_dst_size;

-	int frame_size = av->video_frame_dst_size;
-	if (settings->video_frame_count > retired_video_frames) {
-		memmove(settings->video_frames, settings->video_frames + retired_video_frames*frame_size, (settings->video_frame_count - retired_video_frames)*frame_size);
-	}
-	settings->video_frame_count -= retired_video_frames;
+	if (decoder->audio_sample_count > retired_audio_samples)
+		memmove(
+			decoder->audio_samples,
+			decoder->audio_samples + retired_audio_samples,
+			(decoder->audio_sample_count - retired_audio_samples) * sample_size
+		);
+	if (decoder->video_frame_count > retired_video_frames)
+		memmove(
+			decoder->video_frames,
+			decoder->video_frames + retired_video_frames * frame_size,
+			(decoder->video_frame_count - retired_video_frames) * frame_size
+		);
+
+	decoder->audio_sample_count -= retired_audio_samples;
+	decoder->video_frame_count -= retired_video_frames;
 }

-void close_av_data(settings_t *settings)
-{
-	av_decoder_state_t* av = &(settings->decoder_state_av);
+void close_av_data(decoder_t *decoder) {
+	decoder_state_t *av = &(decoder->state);

 	av_frame_free(&(av->frame));
 	swr_free(&(av->resampler));
+	// Deprecated, kept for compatibility with older FFmpeg versions.
 	avcodec_close(av->audio_codec_context);
 	avcodec_free_context(&(av->audio_codec_context));
 	avformat_free_context(av->format);

-	if(settings->audio_samples != NULL) {
-		free(settings->audio_samples);
-		settings->audio_samples = NULL;
+	if (decoder->audio_samples != NULL) {
+		free(decoder->audio_samples);
+		decoder->audio_samples = NULL;
 	}
-	if(settings->video_frames != NULL) {
-		free(settings->video_frames);
-		settings->video_frames = NULL;
+	if (decoder->video_frames != NULL) {
+		free(decoder->video_frames);
+		decoder->video_frames = NULL;
 	}
 }
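Review note: the aspect ratio correction in open_av_data() above can be tried in isolation. The sketch below copies the same arithmetic into a standalone function. The name fit_to_source_aspect and its signature are hypothetical and not part of psxavenc. The `+ 15.0) & ~15` step rounds the adjusted dimension up to a multiple of 16, presumably because MDEC macroblocks are 16x16 pixels.

```c
// Hypothetical helper, not part of psxavenc: adjusts either *dst_w or *dst_h
// so that the output matches the source aspect ratio. The adjusted dimension
// is rounded up to the next multiple of 16.
static void fit_to_source_aspect(int src_w, int src_h, int *dst_w, int *dst_h) {
	double src_ratio = (double)src_w / (double)src_h;
	double dst_ratio = (double)*dst_w / (double)*dst_h;

	if (src_ratio < dst_ratio) {
		// Source is narrower than the target: shrink the width.
		*dst_w = (int)((double)*dst_h * src_ratio + 15.0) & ~15;
	} else {
		// Source is wider than (or equal to) the target: shrink the height.
		*dst_h = (int)((double)*dst_w / src_ratio + 15.0) & ~15;
	}
}
```

For example, a 1920x1080 source with a requested 320x240 output ends up as 320x192, while a square 500x500 source ends up as 240x240.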
--- a/psxavenc/decoding.h
+++ b/psxavenc/decoding.h
@ -0,0 +1,80 @@
+/*
+psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
+
+Copyright (c) 2019, 2020 Adrian "asie" Siekierka
+Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023, 2025 spicyjpeg
+
+This software is provided 'as-is', without any express or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
+*/
+
+#pragma once
+
+#include <stdbool.h>
+#include <libavutil/opt.h>
+#include <libavcodec/avcodec.h>
+#include <libavcodec/avdct.h>
+#include <libavformat/avformat.h>
+#include <libswresample/swresample.h>
+#include <libswscale/swscale.h>
+#include "args.h"
+
+typedef struct {
+	int video_frame_dst_size;
+	int audio_stream_index;
+	int video_stream_index;
+	AVFormatContext* format;
+	AVStream* audio_stream;
+	AVStream* video_stream;
+	AVCodecContext* audio_codec_context;
+	AVCodecContext* video_codec_context;
+	struct SwrContext* resampler;
+	struct SwsContext* scaler;
+	AVFrame* frame;
+
+	int sample_count_mul;
+
+	double video_next_pts;
+} decoder_state_t;
+
+typedef struct {
+	int16_t *audio_samples;
+	int audio_sample_count;
+	uint8_t *video_frames;
+	int video_frame_count;
+
+	int video_width;
+	int video_height;
+	int video_fps_num;
+	int video_fps_den;
+	bool end_of_input;
+
+	decoder_state_t state;
+} decoder_t;
+
+enum {
+	DECODER_USE_AUDIO      = 1 << 0,
+	DECODER_USE_VIDEO      = 1 << 1,
+	DECODER_AUDIO_REQUIRED = 1 << 2,
+	DECODER_VIDEO_REQUIRED = 1 << 3
+};
+
+bool open_av_data(decoder_t *decoder, const args_t *args, int flags);
+bool poll_av_data(decoder_t *decoder);
+bool ensure_av_data(decoder_t *decoder, int needed_audio_samples, int needed_video_frames);
+void retire_av_data(decoder_t *decoder, int retired_audio_samples, int retired_video_frames);
+void close_av_data(decoder_t *decoder);
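Review note: the four boolean parameters of the old open_av_data() are replaced by the DECODER_* bit flags declared above. The sketch below shows how the flags combine and how the "required" bits are only consulted when the matching "use" bit is set, as in the checks in decoding.c. The enum is copied from the header. The helper missing_stream_is_fatal is hypothetical and exists only for illustration.

```c
#include <stdbool.h>

// Copy of the flag values declared in decoding.h.
enum {
	DECODER_USE_AUDIO      = 1 << 0,
	DECODER_USE_VIDEO      = 1 << 1,
	DECODER_AUDIO_REQUIRED = 1 << 2,
	DECODER_VIDEO_REQUIRED = 1 << 3
};

// Hypothetical helper, not part of psxavenc: returns true if a missing stream
// would make open_av_data() fail for the given flag combination.
static bool missing_stream_is_fatal(int flags, bool has_audio, bool has_video) {
	// A "required" bit only matters if the stream type is in use at all.
	if ((flags & DECODER_USE_AUDIO) && (flags & DECODER_AUDIO_REQUIRED) && !has_audio)
		return true;
	if ((flags & DECODER_USE_VIDEO) && (flags & DECODER_VIDEO_REQUIRED) && !has_video)
		return true;

	return false;
}
```

A caller that needs video but treats audio as optional would pass `DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED`. With that combination a file without an audio stream is still accepted.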
--- a/psxavenc/filefmt.c
+++ b/psxavenc/filefmt.c
@ -3,7 +3,7 @@ psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend

 Copyright (c) 2019, 2020 Adrian "asie" Siekierka
 Copyright (c) 2019 Ben "GreaseMonkey" Russell
-Copyright (c) 2023 spicyjpeg
+Copyright (c) 2023, 2025 spicyjpeg

 This software is provided 'as-is', without any express or implied
 warranty. In no event will the authors be held liable for any damages
@ -22,48 +22,87 @@ freely, subject to the following restrictions:
 3. This notice may not be removed or altered from any source distribution.
 */

-#include "common.h"
-#include "libpsxav.h"
+#include <assert.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include <libpsxav.h>
+#include "args.h"
+#include "decoding.h"
+#include "mdec.h"

-static time_t get_elapsed_time(settings_t *settings) {
-	if (!settings->show_progress) {
-		return 0;
+static time_t start_time = 0;
+static time_t last_progress_update = 0;
+
+static time_t get_elapsed_time(void) {
+	time_t t;
+
+	if (start_time > 0) {
+		t = time(NULL) - start_time;
+	} else {
+		t = 0;
+		start_time = time(NULL);
 	}
-	time_t t = time(NULL) - settings->start_time;
-	if (t <= settings->last_progress_update) {
+
+	if (t <= last_progress_update)
 		return 0;
-	}
-	settings->last_progress_update = t;
+
+	last_progress_update = t;
 	return t;
 }

-static psx_audio_xa_settings_t settings_to_libpsxav_xa_audio(settings_t *settings) {
-	psx_audio_xa_settings_t new_settings;
-	new_settings.bits_per_sample = settings->bits_per_sample;
-	new_settings.frequency = settings->frequency;
-	new_settings.stereo = settings->channels == 2;
-	new_settings.file_number = settings->file_number;
-	new_settings.channel_number = settings->channel_number;
+static psx_audio_xa_settings_t args_to_libpsxav_xa_audio(const args_t *args) {
+	psx_audio_xa_settings_t settings;

-	switch (settings->format) {
-		case FORMAT_XA:
-		case FORMAT_STR2:
-			new_settings.format = PSX_AUDIO_XA_FORMAT_XA;
-			break;
-		default:
-			new_settings.format = PSX_AUDIO_XA_FORMAT_XACD;
-			break;
-	}
+	settings.bits_per_sample = args->audio_bit_depth;
+	settings.frequency = args->audio_frequency;
+	settings.stereo = (args->audio_channels == 2);
+	settings.file_number = args->audio_xa_file;
+	settings.channel_number = args->audio_xa_channel;

-	return new_settings;
+	if (args->format == FORMAT_XACD || args->format == FORMAT_STRCD)
+		settings.format = PSX_AUDIO_XA_FORMAT_XACD;
+	else
+		settings.format = PSX_AUDIO_XA_FORMAT_XA;
+
+	return settings;
 };

-void write_vag_header(int size_per_channel, uint8_t *header, settings_t *settings) {
+static void init_sector_buffer_video(const args_t *args, uint8_t *sector, int lba) {
+	psx_cdrom_sector_xa_subheader_t *subheader = NULL;
+
+	if (args->format == FORMAT_STRCD) {
+		psx_cdrom_init_sector((psx_cdrom_sector_t *)sector, lba, PSX_CDROM_SECTOR_TYPE_MODE2_FORM1);
+		subheader = ((psx_cdrom_sector_t *)sector)->mode2.subheader;
+	} else if (args->format == FORMAT_STR) {
+		subheader = (psx_cdrom_sector_xa_subheader_t *)sector;
+	}
+
+	if (subheader != NULL) {
+		subheader->file = args->audio_xa_file;
+		subheader->channel = args->audio_xa_channel & PSX_CDROM_SECTOR_XA_CHANNEL_MASK;
+		subheader->submode = PSX_CDROM_SECTOR_XA_SUBMODE_DATA | PSX_CDROM_SECTOR_XA_SUBMODE_RT;
+		subheader->coding = 0;
+
+		memcpy(subheader + 1, subheader, sizeof(psx_cdrom_sector_xa_subheader_t));
+	}
+}
+
+#define VAG_HEADER_SIZE 0x30
+
+static void write_vag_header(const args_t *args, int size_per_channel, uint8_t *header) {
+	memset(header, 0, VAG_HEADER_SIZE);
+
 	// Magic
 	header[0x00] = 'V';
 	header[0x01] = 'A';
 	header[0x02] = 'G';
-	header[0x03] = settings->interleave ? 'i' : 'p';
+
+	if (args->format == FORMAT_VAGI)
+		header[0x03] = 'i';
+	else
+		header[0x03] = 'p';

 	// Version (big-endian)
 	header[0x04] = 0x00;
@ -72,311 +111,533 @@ void write_vag_header(int size_per_channel, uint8_t *header, settings_t *setting
 	header[0x07] = 0x20;

 	// Interleave (little-endian)
-	header[0x08] = (uint8_t)settings->interleave;
-	header[0x09] = (uint8_t)(settings->interleave>>8);
-	header[0x0a] = (uint8_t)(settings->interleave>>16);
-	header[0x0b] = (uint8_t)(settings->interleave>>24);
+	if (args->format == FORMAT_VAGI) {
+		header[0x08] = (uint8_t)args->audio_interleave;
+		header[0x09] = (uint8_t)(args->audio_interleave >> 8);
+		header[0x0A] = (uint8_t)(args->audio_interleave >> 16);
+		header[0x0B] = (uint8_t)(args->audio_interleave >> 24);
+	}

 	// Length of data for each channel (big-endian)
-	header[0x0c] = (uint8_t)(size_per_channel>>24);
-	header[0x0d] = (uint8_t)(size_per_channel>>16);
-	header[0x0e] = (uint8_t)(size_per_channel>>8);
-	header[0x0f] = (uint8_t)size_per_channel;
+	header[0x0C] = (uint8_t)(size_per_channel >> 24);
+	header[0x0D] = (uint8_t)(size_per_channel >> 16);
+	header[0x0E] = (uint8_t)(size_per_channel >> 8);
+	header[0x0F] = (uint8_t)size_per_channel;

 	// Sample rate (big-endian)
-	header[0x10] = (uint8_t)(settings->frequency>>24);
-	header[0x11] = (uint8_t)(settings->frequency>>16);
-	header[0x12] = (uint8_t)(settings->frequency>>8);
-	header[0x13] = (uint8_t)settings->frequency;
+	header[0x10] = (uint8_t)(args->audio_frequency >> 24);
+	header[0x11] = (uint8_t)(args->audio_frequency >> 16);
+	header[0x12] = (uint8_t)(args->audio_frequency >> 8);
+	header[0x13] = (uint8_t)args->audio_frequency;

 	// Number of channels (little-endian)
-	header[0x1e] = (uint8_t)settings->channels;
-	header[0x1f] = 0x00;
+	header[0x1E] = (uint8_t)args->audio_channels;
+	header[0x1F] = 0x00;

 	// Filename
-	//strncpy(header + 0x20, "psxavenc", 16);
-	memset(header + 0x20, 0, 16);
+	int name_offset = strlen(args->output_file);
+	while (
+		name_offset > 0 &&
+		args->output_file[name_offset - 1] != '/' &&
+		args->output_file[name_offset - 1] != '\\'
+	)
+		name_offset--;
+
+	strncpy((char*)(header + 0x20), &args->output_file[name_offset], 16);
 }

-void encode_file_spu(settings_t *settings, FILE *output) {
-	psx_audio_encoder_channel_state_t audio_state;	
-	int audio_samples_per_block = psx_audio_spu_get_samples_per_block();
-	int block_size = psx_audio_spu_get_buffer_size_per_block();
-	uint8_t buffer[16];
-	int block_count;
+// The functions below are some peak spaghetti code I would rewrite if that
+// didn't also require scrapping the rest of the codebase. -- spicyjpeg

+void encode_file_xa(const args_t *args, decoder_t *decoder, FILE *output) {
+	psx_audio_xa_settings_t xa_settings = args_to_libpsxav_xa_audio(args);
+
+	int audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
+
+	psx_audio_encoder_state_t audio_state;
+	memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));
+
+	int sector_count = 0;
+
+	for (; ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, 0); sector_count++) {
+		int samples_length = decoder->audio_sample_count / args->audio_channels;
+
+		if (samples_length > audio_samples_per_sector)
+			samples_length = audio_samples_per_sector;
+
+		uint8_t sector[PSX_CDROM_SECTOR_SIZE];
+		int length = psx_audio_xa_encode(
+			xa_settings,
+			&audio_state,
+			decoder->audio_samples,
+			samples_length,
+			sector_count,
+			sector
+		);
+
+		if (decoder->end_of_input)
+			psx_audio_xa_encode_finalize(xa_settings, sector, length);
+
+		retire_av_data(decoder, samples_length * args->audio_channels, 0);
+		fwrite(sector, length, 1, output);
+
+		time_t t = get_elapsed_time();
+
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rLBA: %6d | Encoding speed: %5.2fx",
+				sector_count,
+				(double)(sector_count * audio_samples_per_sector) / (double)(args->audio_frequency * t)
+			);
+		}
+	}
+}
+
+void encode_file_spu(const args_t *args, decoder_t *decoder, FILE *output) {
+	psx_audio_encoder_channel_state_t audio_state;
 	memset(&audio_state, 0, sizeof(psx_audio_encoder_channel_state_t));

 	// The header must be written after the data as we don't yet know the
 	// number of audio samples.
-	if (settings->format == FORMAT_VAG) {
-		fseek(output, 48, SEEK_SET);
+	if (args->format == FORMAT_VAG)
+		fseek(output, VAG_HEADER_SIZE, SEEK_SET);
+
+	uint8_t block[PSX_AUDIO_SPU_BLOCK_SIZE];
+	int block_count = 0;
+
+	if (!(args->flags & FLAG_SPU_NO_LEADING_DUMMY)) {
+		// Insert leading silent block
+		memset(block, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
+
+		fwrite(block, PSX_AUDIO_SPU_BLOCK_SIZE, 1, output);
+		block_count++;
 	}

-	for (block_count = 0; ensure_av_data(settings, audio_samples_per_block, 0); block_count++) {
-		int samples_length = settings->audio_sample_count;
-		if (samples_length > audio_samples_per_block) samples_length = audio_samples_per_block;
+	int loop_start_block = -1;

-		int length = psx_audio_spu_encode(&audio_state, settings->audio_samples, samples_length, 1, buffer);
-		if (!block_count) {
-			// This flag is not required as the SPU already resets the loop
-			// address when starting playback of a sample.
-			//buffer[1] |= PSX_AUDIO_SPU_LOOP_START;
-		}
-		if (settings->end_of_input) {
-			buffer[1] |= settings->loop ? PSX_AUDIO_SPU_LOOP_REPEAT : PSX_AUDIO_SPU_LOOP_END;
-		}
+	if (args->audio_loop_point >= 0)
+		loop_start_block = block_count + (args->audio_loop_point * args->audio_frequency) / (PSX_AUDIO_SPU_SAMPLES_PER_BLOCK * 1000);

-		retire_av_data(settings, samples_length, 0);
-		fwrite(buffer, length, 1, output);
+	for (; ensure_av_data(decoder, PSX_AUDIO_SPU_SAMPLES_PER_BLOCK, 0); block_count++) {
+		int samples_length = decoder->audio_sample_count;

-		time_t t = get_elapsed_time(settings);
-		if (t) {
-			fprintf(stderr, "\rBlock: %6d | Encoding speed: %5.2fx",
+		if (samples_length > PSX_AUDIO_SPU_SAMPLES_PER_BLOCK)
+			samples_length = PSX_AUDIO_SPU_SAMPLES_PER_BLOCK;
+
+		int length = psx_audio_spu_encode(
+			&audio_state,
+			decoder->audio_samples,
+			samples_length,
+			1,
+			block
+		);
+
+		if (block_count == loop_start_block)
+			block[1] |= PSX_AUDIO_SPU_LOOP_START;
+		if ((args->flags & FLAG_SPU_LOOP_END) && decoder->end_of_input)
+			block[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
+
+		retire_av_data(decoder, samples_length, 0);
+		fwrite(block, length, 1, output);
+
+		time_t t = get_elapsed_time();
+
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rBlock: %6d | Encoding speed: %5.2fx",
 				block_count,
-				(double)(block_count*audio_samples_per_block) / (double)(settings->frequency*t)
+				(double)(block_count * PSX_AUDIO_SPU_SAMPLES_PER_BLOCK) / (double)(args->audio_frequency * t)
 			);
 		}
 	}

-	if (settings->format == FORMAT_VAG) {
-		uint8_t header[48];
-		memset(header, 0, 48);
-		write_vag_header(block_count*block_size, header, settings);
+	if (!(args->flags & FLAG_SPU_LOOP_END)) {
+		// Insert trailing looping block
+		memset(block, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
+		block[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
+
+		fwrite(block, PSX_AUDIO_SPU_BLOCK_SIZE, 1, output);
+		block_count++;
+	}
+
+	int overflow = (block_count * PSX_AUDIO_SPU_BLOCK_SIZE) % args->alignment;
+
+	if (overflow) {
+		for (int i = 0; i < (args->alignment - overflow); i++)
+			fputc(0, output);
+	}
+	if (args->format == FORMAT_VAG) {
+		uint8_t header[VAG_HEADER_SIZE];
+		write_vag_header(args, block_count * PSX_AUDIO_SPU_BLOCK_SIZE, header);
+
 		fseek(output, 0, SEEK_SET);
-		fwrite(header, 48, 1, output);
+		fwrite(header, VAG_HEADER_SIZE, 1, output);
 	}
 }

-void encode_file_spu_interleaved(settings_t *settings, FILE *output) {
-	int audio_state_size = sizeof(psx_audio_encoder_channel_state_t) * settings->channels;
+void encode_file_spui(const args_t *args, decoder_t *decoder, FILE *output) {
+	int audio_samples_per_chunk = args->audio_interleave / PSX_AUDIO_SPU_BLOCK_SIZE * PSX_AUDIO_SPU_SAMPLES_PER_BLOCK;

 	// NOTE: since the interleaved .vag format is not standardized, some tools
 	// (such as vgmstream) will not properly play files with interleave < 2048,
 	// alignment != 2048 or channels != 2.
-	int buffer_size = settings->interleave + settings->alignment - 1;
-	buffer_size -= buffer_size % settings->alignment;
-	int header_size = 48 + settings->alignment - 1;
-	header_size -= header_size % settings->alignment;
+	int chunk_size = args->audio_interleave * args->audio_channels + args->alignment - 1;
+	chunk_size -= chunk_size % args->alignment;

+	int header_size = VAG_HEADER_SIZE + args->alignment - 1;
+	header_size -= header_size % args->alignment;
+
+	if (args->format == FORMAT_VAGI)
+		fseek(output, header_size, SEEK_SET);
+
+	int audio_state_size = sizeof(psx_audio_encoder_channel_state_t) * args->audio_channels;
 	psx_audio_encoder_channel_state_t *audio_state = malloc(audio_state_size);
-	uint8_t *buffer = malloc(buffer_size);
-	int audio_samples_per_block = psx_audio_spu_get_samples_per_block();
-	int block_size = psx_audio_spu_get_buffer_size_per_block();
-	int audio_samples_per_chunk = settings->interleave / block_size * audio_samples_per_block;
-	int chunk_count;
-
 	memset(audio_state, 0, audio_state_size);

-	if (settings->format == FORMAT_VAGI) {
-		fseek(output, header_size, SEEK_SET);
-	}
+	uint8_t *chunk = malloc(chunk_size);
+	int chunk_count = 0;

-	for (chunk_count = 0; ensure_av_data(settings, audio_samples_per_chunk*settings->channels, 0); chunk_count++) {
-		int samples_length = settings->audio_sample_count / settings->channels;
-		if (samples_length > audio_samples_per_chunk) samples_length = audio_samples_per_chunk;
+	for (; ensure_av_data(decoder, audio_samples_per_chunk * args->audio_channels, 0); chunk_count++) {
+		int samples_length = decoder->audio_sample_count / args->audio_channels;

-		for (int ch = 0; ch < settings->channels; ch++) {
-			memset(buffer, 0, buffer_size);
-			int length = psx_audio_spu_encode(audio_state + ch, settings->audio_samples + ch, samples_length, settings->channels, buffer);
-			if (length) {
-				//buffer[1] |= PSX_AUDIO_SPU_LOOP_START;
-				if (settings->loop) {
-					buffer[length - block_size + 1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
+		if (samples_length > audio_samples_per_chunk)
+			samples_length = audio_samples_per_chunk;
+
+		memset(chunk, 0, chunk_size);
+		uint8_t *chunk_ptr = chunk;
+
+		// Insert leading silent block
+		if (chunk_count == 0 && !(args->flags & FLAG_SPU_NO_LEADING_DUMMY)) {
+			chunk_ptr += PSX_AUDIO_SPU_BLOCK_SIZE;
+			samples_length -= PSX_AUDIO_SPU_SAMPLES_PER_BLOCK;
+		}
+
+		for (int ch = 0; ch < args->audio_channels; ch++, chunk_ptr += args->audio_interleave) {
+			int length = psx_audio_spu_encode(
+				audio_state + ch,
+				decoder->audio_samples + ch,
+				samples_length,
+				args->audio_channels,
+				chunk_ptr
+			);
+
+			if (length > 0) {
+				uint8_t *last_block = chunk_ptr + length - PSX_AUDIO_SPU_BLOCK_SIZE;
+
+				if (args->flags & FLAG_SPU_LOOP_END) {
+					last_block[1] = PSX_AUDIO_SPU_LOOP_REPEAT;
+				} else if (decoder->end_of_input) {
+					// HACK: the trailing block should in theory be appended to
+					// the existing data, but it's easier to just zerofill and
+					// repurpose the last encoded block
+					memset(last_block, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
+					last_block[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
 				}
-				if (settings->end_of_input) {
-					buffer[length - block_size + 1] |= PSX_AUDIO_SPU_LOOP_END;
-				}
-			}
-
-			fwrite(buffer, buffer_size, 1, output);
-
-			time_t t = get_elapsed_time(settings);
-			if (t) {
-				fprintf(stderr, "\rChunk: %6d | Encoding speed: %5.2fx",
-					chunk_count,
-					(double)(chunk_count*audio_samples_per_chunk) / (double)(settings->frequency*t)
-				);
 			}
 		}

-		retire_av_data(settings, samples_length*settings->channels, 0);
+		retire_av_data(decoder, samples_length * args->audio_channels, 0);
+		fwrite(chunk, chunk_size, 1, output);
+
+		time_t t = get_elapsed_time();
+
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rChunk: %6d | Encoding speed: %5.2fx",
+				chunk_count,
+				(double)(chunk_count * audio_samples_per_chunk) / (double)(args->audio_frequency * t)
+			);
+		}
+
 	}

-	if (settings->format == FORMAT_VAGI) {
+	free(audio_state);
+	free(chunk);
+
+	if (args->format == FORMAT_VAGI) {
 		uint8_t *header = malloc(header_size);
 		memset(header, 0, header_size);
-		write_vag_header(chunk_count*settings->interleave, header, settings);
+		write_vag_header(args, chunk_count * args->audio_interleave, header);
+
 		fseek(output, 0, SEEK_SET);
 		fwrite(header, header_size, 1, output);
 		free(header);
 	}
-
-	free(audio_state);
-	free(buffer);
 }

-void encode_file_xa(settings_t *settings, FILE *output) {
-	psx_audio_xa_settings_t xa_settings = settings_to_libpsxav_xa_audio(settings);
-	psx_audio_encoder_state_t audio_state;	
-	int audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
-	uint8_t buffer[2352];
-
-	memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));
-
-	for (int j = 0; ensure_av_data(settings, audio_samples_per_sector*settings->channels, 0); j++) {
-		int samples_length = settings->audio_sample_count / settings->channels;
-		if (samples_length > audio_samples_per_sector) samples_length = audio_samples_per_sector;
-		int length = psx_audio_xa_encode(xa_settings, &audio_state, settings->audio_samples, samples_length, buffer);
-		if (settings->end_of_input) {
-			psx_audio_xa_encode_finalize(xa_settings, buffer, length);
-		}
-
-		if (settings->format == FORMAT_XACD) {
-			int t = j + 75*2;
-
-			// Put the time in
-			buffer[0x00C] = ((t/75/60)%10)|(((t/75/60)/10)<<4);
-			buffer[0x00D] = (((t/75)%60)%10)|((((t/75)%60)/10)<<4);
-			buffer[0x00E] = ((t%75)%10)|(((t%75)/10)<<4);
-		}
-
-		retire_av_data(settings, samples_length*settings->channels, 0);
-		fwrite(buffer, length, 1, output);
-
-		time_t t = get_elapsed_time(settings);
-		if (t) {
-			fprintf(stderr, "\rLBA: %6d | Encoding speed: %5.2fx",
-				j,
-				(double)(j*audio_samples_per_sector) / (double)(settings->frequency*t)
-			);
-		}
-	}
-}
-
-void encode_file_str(settings_t *settings, FILE *output) {
-	psx_audio_xa_settings_t xa_settings = settings_to_libpsxav_xa_audio(settings);
-	psx_audio_encoder_state_t audio_state;
+void encode_file_str(const args_t *args, decoder_t *decoder, FILE *output) {
+	psx_audio_xa_settings_t xa_settings = args_to_libpsxav_xa_audio(args);
 	int sector_size = psx_audio_xa_get_buffer_size_per_sector(xa_settings);
-	int audio_samples_per_sector;
-	uint8_t buffer[2352];

 	int interleave;
+	int audio_samples_per_sector;
 	int video_sectors_per_block;
-	if (settings->decoder_state_av.audio_stream) {
+
+	if (decoder->state.audio_stream != NULL) {
 		// 1/N audio, (N-1)/N video
+		interleave = psx_audio_xa_get_sector_interleave(xa_settings) * args->str_cd_speed;
 		audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
-		interleave = psx_audio_xa_get_sector_interleave(xa_settings) * settings->cd_speed;
 		video_sectors_per_block = interleave - 1;
+
+		if (!(args->flags & FLAG_QUIET))
+			fprintf(
+				stderr,
+				"Interleave: %d/%d audio, %d/%d video\n",
+				interleave - video_sectors_per_block,
+				interleave,
+				video_sectors_per_block,
+				interleave
+			);
 	} else {
 		// 0/1 audio, 1/1 video
-		audio_samples_per_sector = 0;
 		interleave = 1;
+		audio_samples_per_sector = 0;
 		video_sectors_per_block = 1;
 	}

-	if (!settings->quiet) {
-		fprintf(stderr, "Interleave: %d/%d audio, %d/%d video\n",
-			interleave - video_sectors_per_block, interleave, video_sectors_per_block, interleave);
-	}
-
+	psx_audio_encoder_state_t audio_state;
 	memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));

-	// e.g. 15fps = (150*7/8/15) = 8.75 blocks per frame
-	settings->state_vid.frame_block_base_overflow = (75*settings->cd_speed) * video_sectors_per_block * settings->video_fps_den;
-	settings->state_vid.frame_block_overflow_den = interleave * settings->video_fps_num;
-	double frame_size = (double)settings->state_vid.frame_block_base_overflow / (double)settings->state_vid.frame_block_overflow_den;
-	if (!settings->quiet) {
-		fprintf(stderr, "Frame size: %.2f sectors\n", frame_size);
-	}
+	mdec_encoder_t encoder;
+	init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height);

-	settings->state_vid.frame_output = malloc(2016 * (int)ceil(frame_size));
-	settings->state_vid.frame_index = 0;
-	settings->state_vid.frame_data_offset = 0;
-	settings->state_vid.frame_max_size = 0;
-	settings->state_vid.frame_block_overflow_num = 0;
-	settings->state_vid.quant_scale_sum = 0;
+	// e.g. 15 fps at 2x speed with 1/8 audio: 150 * 7/8 / 15 = 8.75 sectors per frame
+	encoder.state.frame_block_base_overflow = (75 * args->str_cd_speed) * video_sectors_per_block * args->str_fps_den;
+	encoder.state.frame_block_overflow_den = interleave * args->str_fps_num;
+	double frame_size = (double)encoder.state.frame_block_base_overflow / (double)encoder.state.frame_block_overflow_den;
+
+	if (!(args->flags & FLAG_QUIET))
+		fprintf(stderr, "Frame size: %.2f sectors\n", frame_size);
+
+	encoder.state.frame_output = malloc(2016 * (int)ceil(frame_size));
+	encoder.state.frame_index = 0;
+	encoder.state.frame_data_offset = 0;
+	encoder.state.frame_max_size = 0;
+	encoder.state.frame_block_overflow_num = 0;
+	encoder.state.quant_scale_sum = 0;

 	// FIXME: this needs an extra frame to prevent A/V desync
 	int frames_needed = (int) ceil((double)video_sectors_per_block / frame_size);
-	if (frames_needed < 2) frames_needed = 2;

-	for (int j = 0; !settings->end_of_input || settings->state_vid.frame_data_offset < settings->state_vid.frame_max_size; j++) {
-		ensure_av_data(settings, audio_samples_per_sector*settings->channels, frames_needed);
+	if (frames_needed < 2)
+		frames_needed = 2;

-		if ((j%interleave) < video_sectors_per_block) {
-			// Video sector
-			init_sector_buffer_video(buffer, settings);
-			encode_sector_str(settings->video_frames, buffer, settings);
+	int sector_count = 0;
+
+	for (; !decoder->end_of_input || encoder.state.frame_data_offset < encoder.state.frame_max_size; sector_count++) {
+		ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, frames_needed);
+
+		uint8_t sector[PSX_CDROM_SECTOR_SIZE];
+		bool is_video_sector;
+
+		if (audio_samples_per_sector == 0)
+			is_video_sector = true;
+		else if (args->flags & FLAG_STR_TRAILING_AUDIO)
+			is_video_sector = (sector_count % interleave) < video_sectors_per_block;
+		else
+			is_video_sector = (sector_count % interleave) > 0;
+
+		if (is_video_sector) {
+			init_sector_buffer_video(args, sector, sector_count);
+
+			int frames_used = encode_sector_str(
+				&encoder,
+				args->format,
+				args->str_video_id,
+				decoder->video_frames,
+				sector
+			);
+
+			psx_cdrom_calculate_checksums((psx_cdrom_sector_t *)sector, PSX_CDROM_SECTOR_TYPE_MODE2_FORM1);
+			retire_av_data(decoder, 0, frames_used);
 		} else {
-			// Audio sector
-			int samples_length = settings->audio_sample_count / settings->channels;
-			if (samples_length > audio_samples_per_sector) samples_length = audio_samples_per_sector;
+			int samples_length = decoder->audio_sample_count / args->audio_channels;
+
+			if (samples_length > audio_samples_per_sector)
+				samples_length = audio_samples_per_sector;

 			// FIXME: this is an extremely hacky way to handle audio tracks
 			// shorter than the video track
-			if (!samples_length) {
+			if (!samples_length)
 				video_sectors_per_block++;
-			}

-			int length = psx_audio_xa_encode(xa_settings, &audio_state, settings->audio_samples, samples_length, buffer);
-			if (settings->end_of_input) {
-				psx_audio_xa_encode_finalize(xa_settings, buffer, length);
-			}
-			retire_av_data(settings, samples_length*settings->channels, 0);
+			int length = psx_audio_xa_encode(
+				xa_settings,
+				&audio_state,
+				decoder->audio_samples,
+				samples_length,
+				sector_count,
+				sector
+			);
+
+			if (decoder->end_of_input)
+				psx_audio_xa_encode_finalize(xa_settings, sector, length);
+
+			retire_av_data(decoder, samples_length * args->audio_channels, 0);
 		}

-		if (settings->format == FORMAT_STR2CD) {
-			int t = j + 75*2;
+		fwrite(sector, sector_size, 1, output);

-			// Put the time in
-			buffer[0x00C] = ((t/75/60)%10)|(((t/75/60)/10)<<4);
-			buffer[0x00D] = (((t/75)%60)%10)|((((t/75)%60)/10)<<4);
-			buffer[0x00E] = ((t%75)%10)|(((t%75)/10)<<4);
+		time_t t = get_elapsed_time();

-			// FIXME: EDC is not calculated in 2336-byte sector mode (shouldn't
-			// matter anyway, any CD image builder will have to recalculate it
-			// due to the sector's MSF changing)
-			if((j%interleave) < video_sectors_per_block) {
-				calculate_edc_data(buffer);
-			}
-		}
-
-		fwrite(buffer, sector_size, 1, output);
-
-		time_t t = get_elapsed_time(settings);
-		if (t) {
-			fprintf(stderr, "\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
-				settings->state_vid.frame_index,
-				j,
-				(double)settings->state_vid.quant_scale_sum / (double)settings->state_vid.frame_index,
-				(double)(settings->state_vid.frame_index*settings->video_fps_den) / (double)(t*settings->video_fps_num)
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
+				encoder.state.frame_index,
+				sector_count,
+				(double)encoder.state.quant_scale_sum / (double)encoder.state.frame_index,
+				(double)(encoder.state.frame_index * args->str_fps_den) / (double)(t * args->str_fps_num)
 			);
 		}
 	}

-	free(settings->state_vid.frame_output);
+	free(encoder.state.frame_output);
+	destroy_mdec_encoder(&encoder);
 }

-void encode_file_sbs(settings_t *settings, FILE *output) {
-	settings->state_vid.frame_output = malloc(settings->alignment);
-	settings->state_vid.frame_data_offset = 0;
-	settings->state_vid.frame_max_size = settings->alignment;
-	settings->state_vid.quant_scale_sum = 0;
+void encode_file_strspu(const args_t *args, decoder_t *decoder, FILE *output) {
+	int interleave;
+	int audio_samples_per_sector;
+	int video_sectors_per_block;

-	for (int j = 0; ensure_av_data(settings, 0, 1); j++) {
-		encode_frame_bs(settings->video_frames, settings);
-		fwrite(settings->state_vid.frame_output, settings->alignment, 1, output);
+	if (decoder->state.audio_stream != NULL) {
+		assert(false); // TODO: implement

-		time_t t = get_elapsed_time(settings);
-		if (t) {
-			fprintf(stderr, "\rFrame: %4d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
-				j,
-				(double)settings->state_vid.quant_scale_sum / (double)j,
-				(double)(j*settings->video_fps_den) / (double)(t*settings->video_fps_num)
+		if (!(args->flags & FLAG_QUIET))
+			fprintf(
+				stderr,
+				"Interleave: %d/%d audio, %d/%d video\n",
+				interleave - video_sectors_per_block,
+				interleave,
+				video_sectors_per_block,
+				interleave
+			);
+	} else {
+		// 0/1 audio, 1/1 video
+		interleave = 1;
+		audio_samples_per_sector = 0;
+		video_sectors_per_block = 1;
+	}
+
+	mdec_encoder_t encoder;
+	init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height);
+
+	// e.g. 15 fps at 2x speed with 1/8 audio: 150 * 7/8 / 15 = 8.75 sectors per frame
+	encoder.state.frame_block_base_overflow = (75 * args->str_cd_speed) * video_sectors_per_block * args->str_fps_den;
+	encoder.state.frame_block_overflow_den = interleave * args->str_fps_num;
+	double frame_size = (double)encoder.state.frame_block_base_overflow / (double)encoder.state.frame_block_overflow_den;
+
+	if (!(args->flags & FLAG_QUIET))
+		fprintf(stderr, "Frame size: %.2f sectors\n", frame_size);
+
+	encoder.state.frame_output = malloc(2016 * (int)ceil(frame_size));
+	encoder.state.frame_index = 0;
+	encoder.state.frame_data_offset = 0;
+	encoder.state.frame_max_size = 0;
+	encoder.state.frame_block_overflow_num = 0;
+	encoder.state.quant_scale_sum = 0;
+
+	// FIXME: this needs an extra frame to prevent A/V desync
+	int frames_needed = (int) ceil((double)video_sectors_per_block / frame_size);
+
+	if (frames_needed < 2)
+		frames_needed = 2;
+
+	int sector_count = 0;
+
+	for (; !decoder->end_of_input || encoder.state.frame_data_offset < encoder.state.frame_max_size; sector_count++) {
+		ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, frames_needed);
+
+		uint8_t sector[2048];
+		bool is_video_sector;
+
+		if (audio_samples_per_sector == 0)
+			is_video_sector = true;
+		else if (args->flags & FLAG_STR_TRAILING_AUDIO)
+			is_video_sector = (sector_count % interleave) < video_sectors_per_block;
+		else
+			is_video_sector = (sector_count % interleave) > 0;
+
+		if (is_video_sector) {
+			init_sector_buffer_video(args, sector, sector_count);
+
+			int frames_used = encode_sector_str(
+				&encoder,
+				args->format,
+				args->str_video_id,
+				decoder->video_frames,
+				sector
+			);
+
+			retire_av_data(decoder, 0, frames_used);
+		} else {
+			int samples_length = decoder->audio_sample_count / args->audio_channels;
+
+			if (samples_length > audio_samples_per_sector)
+				samples_length = audio_samples_per_sector;
+
+			// FIXME: this is an extremely hacky way to handle audio tracks
+			// shorter than the video track
+			if (!samples_length)
+				video_sectors_per_block++;
+
+			assert(false); // TODO: implement
+
+			retire_av_data(decoder, samples_length * args->audio_channels, 0);
+		}
+
+		fwrite(sector, 2048, 1, output);
+
+		time_t t = get_elapsed_time();
+
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
+				encoder.state.frame_index,
+				sector_count,
+				(double)encoder.state.quant_scale_sum / (double)encoder.state.frame_index,
+				(double)(encoder.state.frame_index * args->str_fps_den) / (double)(t * args->str_fps_num)
 			);
 		}
 	}

-	free(settings->state_vid.frame_output);
+	free(encoder.state.frame_output);
+	destroy_mdec_encoder(&encoder);
+}
+
+void encode_file_sbs(const args_t *args, decoder_t *decoder, FILE *output) {
+	mdec_encoder_t encoder;
+	init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height);
+
+	encoder.state.frame_output = malloc(args->alignment);
+	encoder.state.frame_data_offset = 0;
+	encoder.state.frame_max_size = args->alignment;
+	encoder.state.quant_scale_sum = 0;
+
+	for (int j = 0; ensure_av_data(decoder, 0, 1); j++) {
+		encode_frame_bs(&encoder, decoder->video_frames);
+
+		retire_av_data(decoder, 0, 1);
+		fwrite(encoder.state.frame_output, args->alignment, 1, output);
+
+		time_t t = get_elapsed_time();
+
+		if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
+			fprintf(
+				stderr,
+				"\rFrame: %4d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
+				j,
+				(double)encoder.state.quant_scale_sum / (double)j,
+				(double)(j * args->str_fps_den) / (double)(t * args->str_fps_num)
+			);
+		}
+	}
+
+	free(encoder.state.frame_output);
+	destroy_mdec_encoder(&encoder);
 }
--- a/psxavenc/filefmt.h
+++ b/psxavenc/filefmt.h
@ -3,6 +3,7 @@ psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend

 Copyright (c) 2019, 2020 Adrian "asie" Siekierka
 Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023, 2025 spicyjpeg

 This software is provided 'as-is', without any express or implied
 warranty. In no event will the authors be held liable for any damages
@ -21,40 +22,15 @@ freely, subject to the following restrictions:
 3. This notice may not be removed or altered from any source distribution.
 */

-#include "common.h"
+#pragma once

-void init_sector_buffer_video(uint8_t *buffer, settings_t *settings) {
-	int offset;
-	if (settings->format == FORMAT_STR2CD) {
-		memset(buffer, 0, 2352);
-		memset(buffer+0x001, 0xFF, 10);
-		buffer[0x00F] = 0x02;
-		offset = 0x10;
-	} else {
-		memset(buffer, 0, 2336);
-		offset = 0;
-	}
+#include <stdio.h>
+#include "args.h"
+#include "decoding.h"

-	buffer[offset+0] = settings->file_number;
-	buffer[offset+1] = settings->channel_number & 0x1F;
-	buffer[offset+2] = 0x08 | 0x40;
-	buffer[offset+3] = 0x00;
-	memcpy(buffer + offset + 4, buffer + offset, 4);
-}
-
-void calculate_edc_data(uint8_t *buffer)
-{
-	uint32_t edc = 0;
-	for (int i = 0x010; i < 0x818; i++) {
-		edc ^= 0xFF&(uint32_t)buffer[i];
-		for (int ibit = 0; ibit < 8; ibit++) {
-			edc = (edc>>1)^(0xD8018001*(edc&0x1));
-		}
-	}
-	buffer[0x818] = (uint8_t)(edc);
-	buffer[0x819] = (uint8_t)(edc >> 8);
-	buffer[0x81A] = (uint8_t)(edc >> 16);
-	buffer[0x81B] = (uint8_t)(edc >> 24);
-
-	// TODO: ECC
-}
+void encode_file_xa(const args_t *args, decoder_t *decoder, FILE *output);
+void encode_file_spu(const args_t *args, decoder_t *decoder, FILE *output);
+void encode_file_spui(const args_t *args, decoder_t *decoder, FILE *output);
+void encode_file_str(const args_t *args, decoder_t *decoder, FILE *output);
+void encode_file_strspu(const args_t *args, decoder_t *decoder, FILE *output);
+void encode_file_sbs(const args_t *args, decoder_t *decoder, FILE *output);
--- a/psxavenc/main.c
+++ b/psxavenc/main.c
@ -0,0 +1,201 @@
+/*
+psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
+
+Copyright (c) 2019, 2020 Adrian "asie" Siekierka
+Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023 spicyjpeg
+
+This software is provided 'as-is', without any express or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
+*/
+
+#include <stdint.h>
+#include <stdio.h>
+#include "args.h"
+#include "decoding.h"
+#include "filefmt.h"
+
+static const char *const bs_codec_names[NUM_BS_CODECS] = {
+	"BS v2",
+	"BS v3",
+	"BS v3 (with DC wrapping)"
+};
+
+static const uint8_t decoder_flags[NUM_FORMATS] = {
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // xa
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // xacd
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // spu
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // vag
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // spui
+	DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // vagi
+	DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // str
+	DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strcd
+	DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strspu
+	DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strv
+	DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED // sbs
+};
+
+int main(int argc, const char **argv) {
+	args_t args;
+	decoder_t decoder;
+	FILE *output;
+
+	args.flags = 0;
+
+	args.format = FORMAT_INVALID;
+	args.input_file = NULL;
+	args.output_file = NULL;
+	args.swresample_options = NULL;
+	args.swscale_options = NULL;
+
+	if (!parse_args(&args, argv + 1, argc - 1))
+		return 1;
+	if (!open_av_data(&decoder, &args, decoder_flags[args.format])) {
+		fprintf(stderr, "Failed to open input file: %s\n", args.input_file);
+		return 1;
+	}
+
+	output = fopen(args.output_file, "wb");
+
+	if (output == NULL) {
+		fprintf(stderr, "Failed to open output file: %s\n", args.output_file);
+		return 1;
+	}
+
+	switch (args.format) {
+		case FORMAT_XA:
+		case FORMAT_XACD:
+			if (!(args.flags & FLAG_QUIET))
+				fprintf(
+					stderr,
+					"Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
+					args.audio_frequency,
+					args.audio_bit_depth,
+					(args.audio_channels == 2) ? "stereo" : "mono",
+					args.audio_xa_file,
+					args.audio_xa_channel
+				);
+
+			encode_file_xa(&args, &decoder, output);
+			break;
+
+		case FORMAT_SPU:
+		case FORMAT_VAG:
+			if (!(args.flags & FLAG_QUIET))
+				fprintf(
+					stderr,
+					"Audio format: SPU-ADPCM, %d Hz mono\n",
+					args.audio_frequency
+				);
+
+			encode_file_spu(&args, &decoder, output);
+			break;
+
+		case FORMAT_SPUI:
+		case FORMAT_VAGI:
+			if (!(args.flags & FLAG_QUIET))
+				fprintf(
+					stderr,
+					"Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
+					args.audio_frequency,
+					args.audio_channels,
+					args.audio_interleave
+				);
+
+			encode_file_spui(&args, &decoder, output);
+			break;
+
+		case FORMAT_STR:
+		case FORMAT_STRCD:
+			if (!(args.flags & FLAG_QUIET)) {
+				if (decoder.state.audio_stream)
+					fprintf(
+						stderr,
+						"Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
+						args.audio_frequency,
+						args.audio_bit_depth,
+						(args.audio_channels == 2) ? "stereo" : "mono",
+						args.audio_xa_file,
+						args.audio_xa_channel
+					);
+
+				fprintf(
+					stderr,
+					"Video format: %s, %dx%d, %.2f fps\n",
+					bs_codec_names[args.video_codec],
+					args.video_width,
+					args.video_height,
+					(double)args.str_fps_num / (double)args.str_fps_den
+				);
+			}
+
+			encode_file_str(&args, &decoder, output);
+			break;
+
+		case FORMAT_STRSPU:
+			// TODO: implement and remove this check
+			fprintf(stderr, "This format is not currently supported\n");
+			break;
+
+		case FORMAT_STRV:
+			if (!(args.flags & FLAG_QUIET)) {
+				if (decoder.state.audio_stream)
+					fprintf(
+						stderr,
+						"Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
+						args.audio_frequency,
+						args.audio_channels,
+						args.audio_interleave
+					);
+
+				fprintf(
+					stderr,
+					"Video format: %s, %dx%d, %.2f fps\n",
+					bs_codec_names[args.video_codec],
+					args.video_width,
+					args.video_height,
+					(double)args.str_fps_num / (double)args.str_fps_den
+				);
+			}
+
+			encode_file_strspu(&args, &decoder, output);
+			break;
+
+		case FORMAT_SBS:
+			if (!(args.flags & FLAG_QUIET))
+				fprintf(
+					stderr,
+					"Video format: %s, %dx%d, %.2f fps\n",
+					bs_codec_names[args.video_codec],
+					args.video_width,
+					args.video_height,
+					(double)args.str_fps_num / (double)args.str_fps_den
+				);
+
+			encode_file_sbs(&args, &decoder, output);
+			break;
+
+		default:
+			;
+	}
+
+	if (!(args.flags & FLAG_HIDE_PROGRESS))
+		fprintf(stderr, "\nDone.\n");
+
+	fclose(output);
+	close_av_data(&decoder);
+	return 0;
+}
--- a/psxavenc/mdec.c
+++ b/psxavenc/mdec.c
--- a/psxavenc/mdec.h
+++ b/psxavenc/mdec.h
@ -0,0 +1,74 @@
+/*
+psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
+
+Copyright (c) 2019, 2020 Adrian "asie" Siekierka
+Copyright (c) 2019 Ben "GreaseMonkey" Russell
+Copyright (c) 2023, 2025 spicyjpeg
+
+This software is provided 'as-is', without any express or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you must not
+   claim that you wrote the original software. If you use this software
+   in a product, an acknowledgment in the product documentation would be
+   appreciated but is not required.
+2. Altered source versions must be plainly marked as such, and must not be
+   misrepresented as being the original software.
+3. This notice may not be removed or altered from any source distribution.
+*/
+
+#pragma once
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <libavcodec/avdct.h>
+#include "args.h"
+
+typedef struct {
+	int frame_index;
+	int frame_data_offset;
+	int frame_max_size;
+	int frame_block_base_overflow;
+	int frame_block_overflow_num;
+	int frame_block_overflow_den;
+	int block_type;
+	int16_t last_dc_values[3];
+	uint16_t bits_value;
+	int bits_left;
+	uint8_t *frame_output;
+	int bytes_used;
+	int blocks_used;
+	int uncomp_hwords_used;
+	int quant_scale;
+	int quant_scale_sum;
+
+	AVDCT *dct_context;
+	uint32_t *ac_huffman_map;
+	uint32_t *dc_huffman_map;
+	int16_t *coeff_clamp_map;
+	int16_t *dct_block_lists[6];
+} mdec_encoder_state_t;
+
+typedef struct {
+	bs_codec_t video_codec;
+	int video_width;
+	int video_height;
+
+	mdec_encoder_state_t state;
+} mdec_encoder_t;
+
+bool init_mdec_encoder(mdec_encoder_t *encoder, bs_codec_t video_codec, int video_width, int video_height);
+void destroy_mdec_encoder(mdec_encoder_t *encoder);
+void encode_frame_bs(mdec_encoder_t *encoder, const uint8_t *video_frame);
+int encode_sector_str(
+	mdec_encoder_t *encoder,
+	format_t format,
+	uint16_t str_video_id,
+	const uint8_t *video_frames,
+	uint8_t *output
+);
--- a/psxavenc/psxavenc.c
+++ b/psxavenc/psxavenc.c
@ -1,417 +0,0 @@
-/*
-psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
-
-Copyright (c) 2019, 2020 Adrian "asie" Siekierka
-Copyright (c) 2019 Ben "GreaseMonkey" Russell
-Copyright (c) 2023 spicyjpeg
-
-This software is provided 'as-is', without any express or implied
-warranty. In no event will the authors be held liable for any damages
-arising from the use of this software.
-
-Permission is granted to anyone to use this software for any purpose,
-including commercial applications, and to alter it and redistribute it
-freely, subject to the following restrictions:
-
-1. The origin of this software must not be misrepresented; you must not
-   claim that you wrote the original software. If you use this software
-   in a product, an acknowledgment in the product documentation would be
-   appreciated but is not required.
-2. Altered source versions must be plainly marked as such, and must not be
-   misrepresented as being the original software.
-3. This notice may not be removed or altered from any source distribution.
-*/
-
-#include "common.h"
-
-const char *format_names[NUM_FORMATS] = {
-	"xa", "xacd",
-	"spu", "spui",
-	"vag", "vagi",
-	"str2", "str2cd",
-	"sbs2"
-};
-
-void print_help(void) {
-	fprintf(stderr,
-		"Usage:\n"
-		"    psxavenc -t <xa|xacd>     [-f 18900|37800] [-b 4|8] [-c 1|2] [-F 0-255] [-C 0-31] <in> <out.xa>\n"
-		"    psxavenc -t <str2|str2cd> [-f 18900|37800] [-b 4|8] [-c 1|2] [-F 0-255] [-C 0-31] [-s WxH] [-I] [-r num/den] [-x 1|2] <in> <out.str>\n"
-		"    psxavenc -t sbs2          [-s WxH] [-I] [-r num/den] [-a size] <in> <out.str>\n"
-		"    psxavenc -t <spu|vag>     [-f freq] [-L] <in> <out.vag>\n"
-		"    psxavenc -t <spui|vagi>   [-f freq] [-c 1-24] [-L] [-i size] [-a size] <in> <out.vag>\n"
-		"\nTool options:\n"
-		"    -h               Show this help message and exit\n"
-		"    -q               Suppress all non-error messages\n"
-		"\nOutput options:\n"
-		"    -t format        Use specified output type:\n"
-		"                       xa     [A.] .xa, 2336-byte sectors\n"
-		"                       xacd   [A.] .xa, 2352-byte sectors\n"
-		"                       spu    [A.] raw SPU-ADPCM mono data\n"
-		"                       spui   [A.] raw SPU-ADPCM interleaved data\n"
-		"                       vag    [A.] .vag SPU-ADPCM mono\n"
-		"                       vagi   [A.] .vag SPU-ADPCM interleaved\n"
-		"                       str2   [AV] v2 .str video, 2336-byte sectors\n"
-		"                       str2cd [AV] v2 .str video, 2352-byte sectors\n"
-		"                       sbs2   [.V] v2 .sbs video, 2048-byte sectors\n"
-		"    -F num           Set the XA file number for xa/str2 (0-255)\n"
-		"    -C num           Set the XA channel number for xa/str2 (0-31)\n"
-		"\nAudio options:\n"
-		"    -f freq          Use specified sample rate (must be 18900 or 37800 for xa/str2)\n"
-		"    -b bitdepth      Use specified bit depth for xa/str2 (4 or 8)\n"
-		"    -c channels      Use specified channel count (1-2 for xa/str2, any for spui/vagi)\n"
-		"    -L               Add a loop marker at the end of SPU-ADPCM data\n"
-		"    -R key=value,... Pass custom options to libswresample (see ffmpeg docs)\n"
-		"\nSPU interleaving options (spui/vagi format):\n"
-		"    -i size          Use specified interleave\n"
-		"    -a size          Pad header and each interleaved chunk to specified size\n"
-		"\nVideo options (str2/str2cd/sbs2 format):\n"
-		"    -s WxH           Rescale input file to fit within specified size (default 320x240)\n"
-		"    -I               Force stretching to given size without preserving aspect ratio\n"
-		"    -S key=value,... Pass custom options to libswscale (see ffmpeg docs)\n"
-		"    -r num/den       Set frame rate to specified integer or fraction (default 15)\n"
-		"    -x speed         Set the CD-ROM speed the file is meant to be played at (1-2)\n"
-		"    -a size          Set the size of each frame for sbs2\n"
-	);
-}
-
-int parse_args(settings_t* settings, int argc, char** argv) {
-	int c, i;
-	char *next;
-	while ((c = getopt(argc, argv, "?hqt:F:C:f:b:c:LR:i:a:s:IS:r:x:")) != -1) {
-		switch (c) {
-			case '?':
-			case 'h': {
-				print_help();
-				return -1;
-			} break;
-			case 'q': {
-				settings->quiet = true;
-				settings->show_progress = false;
-			} break;
-			case 't': {
-				settings->format = -1;
-				for (i = 0; i < NUM_FORMATS; i++) {
-					if (!strcmp(optarg, format_names[i])) {
-						settings->format = i;
-						break;
-					}
-				}
-				if (settings->format < 0) {
-					fprintf(stderr, "Invalid format: %s\n", optarg);
-					return -1;
-				}
-			} break;
-			case 'F': {
-				settings->file_number = strtol(optarg, NULL, 0);
-				if (settings->file_number < 0 || settings->file_number > 255) {
-					fprintf(stderr, "Invalid file number: %d\n", settings->file_number);
-					return -1;
-				}
-			} break;
-			case 'C': {
-				settings->channel_number = strtol(optarg, NULL, 0);
-				if (settings->channel_number < 0 || settings->channel_number > 31) {
-					fprintf(stderr, "Invalid channel number: %d\n", settings->channel_number);
-					return -1;
-				}
-			} break;
-			case 'f': {
-				settings->frequency = strtol(optarg, NULL, 0);
-			} break;
-			case 'b': {
-				settings->bits_per_sample = strtol(optarg, NULL, 0);
-				if (settings->bits_per_sample != 4 && settings->bits_per_sample != 8) {
-					fprintf(stderr, "Invalid bit depth: %d\n", settings->bits_per_sample);
-					return -1;
-				}
-			} break;
-			case 'c': {
-				settings->channels = strtol(optarg, NULL, 0);
-				if (settings->channels < 1 || settings->channels > 24) {
-					fprintf(stderr, "Invalid channel count: %d\n", settings->channels);
-					return -1;
-				}
-			} break;
-			case 'L': {
-				settings->loop = true;
-			} break;
-			case 'R': {
-				settings->swresample_options = optarg;
-			} break;
-			case 'i': {
-				settings->interleave = (strtol(optarg, NULL, 0) + 15) & ~15;
-				if (settings->interleave < 16) {
-					fprintf(stderr, "Invalid interleave: %d\n", settings->interleave);
-					return -1;
-				}
-			} break;
-			case 'a': {
-				settings->alignment = strtol(optarg, NULL, 0);
-				if (settings->alignment < 1) {
-					fprintf(stderr, "Invalid alignment: %d\n", settings->alignment);
-					return -1;
-				}
-			} break;
-			case 's': {
-				settings->video_width = (strtol(optarg, &next, 0) + 15) & ~15;
-				if (*next != 'x') {
-					fprintf(stderr, "Invalid video size (must be specified as <width>x<height>)\n");
-					return -1;
-				}
-				settings->video_height = (strtol(next + 1, NULL, 0) + 15) & ~15;
-
-				if (settings->video_width < 16 || settings->video_width > 320) {
-					fprintf(stderr, "Invalid video width: %d\n", settings->video_width);
-					return -1;
-				}
-				if (settings->video_height < 16 || settings->video_height > 240) {
-					fprintf(stderr, "Invalid video height: %d\n", settings->video_height);
-					return -1;
-				}
-			} break;
-			case 'I': {
-				settings->ignore_aspect_ratio = true;
-			} break;
-			case 'S': {
-				settings->swscale_options = optarg;
-			} break;
-			case 'r': {
-				settings->video_fps_num = strtol(optarg, &next, 0);
-				if (*next == '/') {
-					settings->video_fps_den = strtol(next + 1, NULL, 0);
-				} else {
-					settings->video_fps_den = 1;
-				}
-
-				if (!settings->video_fps_den) {
-					fprintf(stderr, "Invalid frame rate denominator\n");
-					return -1;
-				}
-				i = settings->video_fps_num / settings->video_fps_den;
-				if (i < 1 || i > 30) {
-					fprintf(stderr, "Invalid frame rate: %d/%d\n", settings->video_fps_num, settings->video_fps_den);
-					return -1;
-				}
-			} break;
-			case 'x': {
-				settings->cd_speed = strtol(optarg, NULL, 0);
-				if (settings->cd_speed < 1 || settings->cd_speed > 2) {
-					fprintf(stderr, "Invalid CD-ROM speed: %d\n", settings->cd_speed);
-					return -1;
-				}
-			} break;
-		}
-	}
-
-	// Validate settings
-	switch (settings->format) {
-		case FORMAT_XA:
-		case FORMAT_XACD:
-		case FORMAT_STR2:
-		case FORMAT_STR2CD:
-			if (settings->frequency != PSX_AUDIO_XA_FREQ_SINGLE && settings->frequency != PSX_AUDIO_XA_FREQ_DOUBLE) {
-				fprintf(
-					stderr, "Invalid XA-ADPCM frequency: %d Hz (must be %d or %d Hz)\n", settings->frequency,
-					PSX_AUDIO_XA_FREQ_SINGLE, PSX_AUDIO_XA_FREQ_DOUBLE
-				);
-				return -1;
-			}
-			if (settings->channels > 2) {
-				fprintf(stderr, "Invalid XA-ADPCM channel count: %d (must be 1 or 2)\n", settings->channels);
-				return -1;
-			}
-			if (settings->loop) {
-				fprintf(stderr, "XA-ADPCM does not support loop markers\n");
-				return -1;
-			}
-			break;
-		case FORMAT_SPU:
-		case FORMAT_VAG:
-			if (settings->bits_per_sample != 4) {
-				fprintf(stderr, "Invalid SPU-ADPCM bit depth: %d (must be 4)\n", settings->bits_per_sample);
-				return -1;
-			}
-			if (settings->channels != 1) {
-				fprintf(stderr, "Invalid SPU-ADPCM channel count: %d (must be 1)\n", settings->channels);
-				return -1;
-			}
-			if (settings->interleave) {
-				fprintf(stderr, "Interleave cannot be specified for mono SPU-ADPCM\n");
-				return -1;
-			}
-			break;
-		case FORMAT_SPUI:
-		case FORMAT_VAGI:
-			if (settings->bits_per_sample != 4) {
-				fprintf(stderr, "Invalid SPU-ADPCM bit depth: %d (must be 4)\n", settings->bits_per_sample);
-				return -1;
-			}
-			if (!settings->interleave) {
-				fprintf(stderr, "Interleave must be specified for interleaved SPU-ADPCM\n");
-				return -1;
-			}
-			break;
-		case FORMAT_SBS2:
-			if (!settings->alignment) {
-				fprintf(stderr, "Alignment (frame size) must be specified\n");
-				return -1;
-			}
-			if (settings->alignment < 256) {
-				fprintf(stderr, "Invalid frame size: %d (must be at least 256)\n", settings->alignment);
-				return -1;
-			}
-			break;
-		default:
-			fprintf(stderr, "Output format must be specified\n");
-			return -1;
-	}
-
-	return optind;
-}
-
-int main(int argc, char **argv) {
-	settings_t settings;
-	int arg_offset;
-	FILE* output;
-
-	memset(&settings,0,sizeof(settings_t));
-
-	settings.quiet = false;
-	settings.show_progress = isatty(fileno(stderr));
-
-	settings.format = -1;
-	settings.file_number = 0;
-	settings.channel_number = 0;
-	settings.cd_speed = 2;
-	settings.channels = 1;
-	settings.frequency = PSX_AUDIO_XA_FREQ_DOUBLE;
-	settings.bits_per_sample = 4;
-	settings.interleave = 0;
-	settings.alignment = 2048;
-	settings.loop = false;
-
-	// NOTE: ffmpeg/ffplay's .str demuxer has the frame rate hardcoded to 15fps
-	// so if you're messing around with this make sure you test generated files
-	// with another player and/or in an emulator.
-	settings.video_width = 320;
-	settings.video_height = 240;
-	settings.video_fps_num = 15;
-	settings.video_fps_den = 1;
-	settings.ignore_aspect_ratio = false;
-
-	settings.swresample_options = NULL;
-	settings.swscale_options = NULL;
-
-	settings.audio_samples = NULL;
-	settings.audio_sample_count = 0;
-	settings.video_frames = NULL;
-	settings.video_frame_count = 0;
-
-	for(int i = 0; i < 6; i++) {
-		settings.state_vid.dct_block_lists[i] = NULL;
-	}
-
-	if (argc < 2) {
-		print_help();
-		return 1;
-	}
-
-	arg_offset = parse_args(&settings, argc, argv);
-	if (arg_offset < 0) {
-		return 1;
-	} else if (argc < arg_offset + 2) {
-		print_help();
-		return 1;
-	}
-
-	bool has_audio = (settings.format != FORMAT_SBS2);
-	bool has_video = (settings.format == FORMAT_STR2) ||
-		(settings.format == FORMAT_STR2CD) || (settings.format == FORMAT_SBS2);
-
-	bool did_open_data = open_av_data(argv[arg_offset + 0], &settings,
-		has_audio, has_video, !has_video, has_video);
-	if (!did_open_data) {
-		fprintf(stderr, "Could not open input file!\n");
-		return 1;
-	}
-
-	output = fopen(argv[arg_offset + 1], "wb");
-	if (output == NULL) {
-		fprintf(stderr, "Could not open output file!\n");
-		return 1;
-	}
-
-	settings.start_time = time(NULL);
-	settings.last_progress_update = 0;
-
-	switch (settings.format) {
-		case FORMAT_XA:
-		case FORMAT_XACD:
-			if (!settings.quiet) {
-				fprintf(stderr, "Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
-					settings.frequency, settings.bits_per_sample,
-					(settings.channels == 2) ? "stereo" : "mono",
-					settings.file_number, settings.channel_number
-				);
-			}
-
-			encode_file_xa(&settings, output);
-			break;
-		case FORMAT_SPU:
-		case FORMAT_VAG:
-			if (!settings.quiet) {
-				fprintf(stderr, "Audio format: SPU-ADPCM, %d Hz mono\n",
-					settings.frequency
-				);
-			}
-
-			encode_file_spu(&settings, output);
-			break;
-		case FORMAT_SPUI:
-		case FORMAT_VAGI:
-			if (!settings.quiet) {
-				fprintf(stderr, "Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
-					settings.frequency, settings.channels, settings.interleave
-				);
-			}
-
-			encode_file_spu_interleaved(&settings, output);
-			break;
-		case FORMAT_STR2:
-		case FORMAT_STR2CD:
-			if (!settings.quiet) {
-				if (settings.decoder_state_av.audio_stream) {
-					fprintf(stderr, "Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
-						settings.frequency, settings.bits_per_sample,
-						(settings.channels == 2) ? "stereo" : "mono",
-						settings.file_number, settings.channel_number
-					);
-				}
-				fprintf(stderr, "Video format: BS v2, %dx%d, %.2f fps\n",
-					settings.video_width, settings.video_height,
-					(double)settings.video_fps_num / (double)settings.video_fps_den
-				);
-			}
-
-			encode_file_str(&settings, output);
-			break;
-		case FORMAT_SBS2:
-			if (!settings.quiet) {
-				fprintf(stderr, "Video format: BS v2, %dx%d, %.2f fps\n",
-					settings.video_width, settings.video_height,
-					(double)settings.video_fps_num / (double)settings.video_fps_den
-				);
-			}
-
-			encode_file_sbs(&settings, output);
-			break;
-	}
-
-	if (settings.show_progress) {
-		fprintf(stderr, "\nDone.\n");
-	}
-	fclose(output);
-	close_av_data(&settings);
-	return 0;
-}
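Reviewer note: the removed `parse_args()` rounds `-s` dimensions up to a multiple of 16 with `(value + 15) & ~15` and validates `-r` by the integer quotient of the fraction. Below is a minimal standalone sketch of both checks, using the limits from the removed code (320x240, 1 to 30 fps). The helper names are hypothetical, and unlike the original this version parses in base 10 and rejects trailing characters.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Round a non-negative value up to the next multiple of 16. */
static int round_up_16(int value) {
	return (value + 15) & ~15;
}

/* Parse "<width>x<height>" and round both dimensions up to a multiple of 16.
 * Accepting 1..320 before rounding gives the same range as rounding first
 * and then checking 16..320, as the removed code did. */
static bool parse_video_size(const char *text, int *width, int *height) {
	assert(text && width && height);

	char *next;
	long w = strtol(text, &next, 10);
	if (next == text || *next != 'x')
		return false;

	const char *height_text = next + 1;
	long h = strtol(height_text, &next, 10);
	if (next == height_text || *next != '\0')
		return false;
	if (w < 1 || w > 320 || h < 1 || h > 240)
		return false;

	*width = round_up_16((int)w);
	*height = round_up_16((int)h);
	return true;
}

/* Parse "<num>" or "<num>/<den>"; the integer quotient must be in 1..30. */
static bool parse_frame_rate(const char *text, int *num, int *den) {
	assert(text && num && den);

	char *next;
	long n = strtol(text, &next, 10);
	long d = 1;
	if (next == text)
		return false;

	if (*next == '/') {
		const char *den_text = next + 1;
		d = strtol(den_text, &next, 10);
		if (next == den_text)
			return false;
	}
	if (*next != '\0' || n < 1 || d < 1 || n > INT_MAX || d > INT_MAX)
		return false;
	if (n / d < 1 || n / d > 30)
		return false;

	*num = (int)n;
	*den = (int)d;
	return true;
}
```

Because the range check uses integer division, a fraction such as `30000/1001` passes (quotient 29), while `61` is rejected.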
Author	SHA1	Message	Date
Adrian Siekierka	ed4821fcac	chore: update CI ffmpeg to 7.1.1	2025-03-08 07:38:56 +01:00
Adrian Siekierka	96cc5fc5c0	Merge pull request #8 from spicyjpeg/bs-v3-update BS v3 support, SPU-ADPCM loop flag fixes and general refactoring	2025-03-08 07:21:49 +01:00
spicyjpeg	801d70e22e	Disable unimplemented formats, add missing const qualifiers	2025-03-08 01:10:42 +01:00
spicyjpeg	60cbaca2b2	Fix str subheader corruption, update README	2025-03-05 01:32:35 +01:00
spicyjpeg	24d37145c6	Bugfixes, add -T and -A options	2025-03-02 20:15:06 +01:00
spicyjpeg	7d537edffb	Clean up, implement new SPU-ADPCM looping options	2025-03-02 12:12:51 +01:00
spicyjpeg	4a0d0c55fd	Add BS v3 encoding support	2025-02-28 11:42:23 +01:00
spicyjpeg	a39f159aaf	Refactor and get rid of common.h	2025-02-28 02:15:21 +01:00
spicyjpeg	7b5953322f	Add new argument parser	2025-02-28 01:26:41 +01:00
spicyjpeg	982fad256e	Add .editorconfig, .gitignore and FFmpeg deprecation note	2025-02-25 18:54:53 +01:00
Adrian Siekierka	b6c8a1c7b6	chore: enable LTO in CI builds	2025-02-16 16:29:34 +01:00
Adrian Siekierka	744aab19e5	Add str2v format	2025-02-16 16:09:17 +01:00
spicyjpeg	302989badf	Optimize MDEC encoder, use ffmpeg DCT implementation	2025-02-16 16:00:08 +01:00
Adrian Siekierka	87b0fe3f2a	libpsxav: fix EOF flag being misapplied in xa/xacd output (#7); psxavenc: refactor str2/str2cd output to match xa/xacd output cleanup; psxavenc: fix EDC not being calculated in 2336-byte sector mode	2025-02-16 16:00:41 +01:00
Adrian Siekierka	3478a92512	chore: fix GitHub CI	2025-02-16 15:56:09 +01:00
Adrian Siekierka	8712e4e9fd	Work around SPU loop flag update bug Co-authored-by: spicyjpeg <thatspicyjpeg@gmail.com>	2025-02-16 15:34:29 +01:00
Adrian Siekierka	145aa5ac65	add version information	2025-02-16 15:32:42 +01:00
spicyjpeg	f127a72f11	Add -a option for spu/vag formats, better defaults	2025-02-16 15:23:31 +01:00
Adrian Siekierka	bea15ca01f	increase permissible frame rate range to 1..60 FPS	2025-02-15 22:56:57 +01:00
Adrian Siekierka	f8e44a59c3	re-enable colorspace conversion	2025-02-15 22:52:17 +01:00
Adrian Siekierka	703bb67393	chore: update ffmpeg in Windows build to 7.1	2025-02-15 22:45:53 +01:00
Adrian Siekierka	ac8280cd85	chore: fix GitHub CI	2025-02-15 22:44:33 +01:00