Compare commits

No commits in common. "main" and "v0.2.0" have entirely different histories.
main ... v0.2.0

20 changed files with 1606 additions and 2964 deletions


@@ -1,9 +0,0 @@
root = true
[*]
indent_style = tab
indent_size = 4
charset = utf-8
end_of_line = lf
trim_trailing_whitespace = true
insert_final_newline = true


@@ -1,7 +1,7 @@
#!/bin/bash #!/bin/bash
ROOT_DIR="$(pwd)" ROOT_DIR="$(pwd)"
FFMPEG_VERSION="7.1.1" FFMPEG_VERSION="6.0"
NUM_JOBS="4" NUM_JOBS="4"
if [ $# -eq 1 ]; then if [ $# -eq 1 ]; then
@@ -67,7 +67,6 @@ rm -rf ffmpeg-build
meson setup \ meson setup \
--buildtype release \ --buildtype release \
-Db_lto=true \
--strip \ --strip \
--prefix $ROOT_DIR/psxavenc-dist \ --prefix $ROOT_DIR/psxavenc-dist \
--pkg-config-path $ROOT_DIR/ffmpeg-dist/lib/pkgconfig \ --pkg-config-path $ROOT_DIR/ffmpeg-dist/lib/pkgconfig \


@@ -17,14 +17,13 @@ jobs:
uses: actions/checkout@v3 uses: actions/checkout@v3
with: with:
path: psxavenc path: psxavenc
fetch-depth: 0
- name: Build psxavenc for Windows - name: Build psxavenc for Windows
run: | run: |
psxavenc/.github/scripts/build.sh psxavenc-windows x86_64-w64-mingw32 psxavenc/.github/scripts/mingw-cross.txt psxavenc/.github/scripts/build.sh psxavenc-windows x86_64-w64-mingw32 psxavenc/.github/scripts/mingw-cross.txt
- name: Upload Windows build artifacts - name: Upload Windows build artifacts
uses: actions/upload-artifact@v4 uses: actions/upload-artifact@v3
with: with:
name: psxavenc-windows name: psxavenc-windows
path: psxavenc-windows.zip path: psxavenc-windows.zip
@@ -34,7 +33,7 @@ jobs:
psxavenc/.github/scripts/build.sh psxavenc-linux psxavenc/.github/scripts/build.sh psxavenc-linux
- name: Upload Linux build artifacts - name: Upload Linux build artifacts
uses: actions/upload-artifact@v4 uses: actions/upload-artifact@v3
with: with:
name: psxavenc-linux name: psxavenc-linux
path: psxavenc-linux.zip path: psxavenc-linux.zip

.gitignore vendored

@@ -1,6 +0,0 @@
desktop.ini
.DS_Store
.vscode/
build/
.cache/
*.code-workspace


@@ -2,7 +2,7 @@
# psxavenc # psxavenc
psxavenc is an open-source command-line tool for encoding audio and video data psxavenc is an open-source command-line tool for encoding audio and video data
into formats commonly used on the original PlayStation and PlayStation 2. into formats commonly used on the original PlayStation.
## Installation ## Installation
@@ -14,22 +14,22 @@ Requirements:
```shell ```shell
$ meson setup build $ meson setup build
$ meson compile -C build $ cd build
$ meson install -C build $ ninja install
``` ```
## Usage ## Usage
Run `psxavenc -h`. Run `psxavenc`.
### Examples ### Examples
Rescale a video file to ≤320x240 pixels (preserving aspect ratio) and encode it Rescale a video file to ≤320x240 pixels (preserving aspect ratio) and encode it
into a 15 fps version 2 .str file with 37800 Hz 4-bit stereo audio and 2352-byte into a 15fps .STR file with 37800 Hz 4-bit stereo audio and 2352-byte sectors,
sectors, meant to be played at 2x CD-ROM speed: meant to be played at 2x CD-ROM speed:
```shell ```shell
$ psxavenc -t strcd -v v2 -f 37800 -b 4 -c 2 -s 320x240 -r 15 -x 2 in.mp4 out.str $ psxavenc -t str2cd -f 37800 -b 4 -c 2 -s 320x240 -r 15 -x 2 in.mp4 out.str
``` ```
Convert a mono audio sample to 22050 Hz raw SPU-ADPCM data: Convert a mono audio sample to 22050 Hz raw SPU-ADPCM data:
@@ -38,77 +38,35 @@ Convert a mono audio sample to 22050 Hz raw SPU-ADPCM data:
$ psxavenc -t spu -f 22050 in.ogg out.snd $ psxavenc -t spu -f 22050 in.ogg out.snd
``` ```
Convert a stereo audio file to a 44100 Hz interleaved .vag file with 2048-byte Convert a stereo audio file to a 44100 Hz interleaved .VAG file with 8192-byte
interleave and loop flags set at the end of each interleaved chunk: interleave and loop flags set at the end of each interleaved chunk:
```shell ```shell
$ psxavenc -t vagi -f 44100 -c 2 -L -i 2048 in.wav out.vag $ psxavenc -t vagi -f 44100 -c 2 -L -i 8192 in.wav out.vag
``` ```
## Supported output formats ## Supported formats
The output format must be set using the `-t` option. | Format | Audio | Channels | Video | Sector size |
| :------- | :--------------- | :------- | :---- | :---------- |
| Format | Audio codec | Audio channels | Video codec | Sector size | | `xa` | XA-ADPCM | 1 or 2 | None | 2336 bytes |
| :------ | :------------------- | :------------- | :------------ | :---------- | | `xacd` | XA-ADPCM | 1 or 2 | None | 2352 bytes |
| `xa` | XA-ADPCM | 1 or 2 | | 2336 bytes | | `spu` | SPU-ADPCM | 1 | None | |
| `xacd` | XA-ADPCM | 1 or 2 | | 2352 bytes | | `spui` | SPU-ADPCM | Any | None | Any |
| `spu` | SPU-ADPCM | 1 | | | | `vag` | SPU-ADPCM | 1 | None | |
| `vag` | SPU-ADPCM | 1 | | | | `vagi` | SPU-ADPCM | Any | None | Any |
| `spui` | SPU-ADPCM | Any | | | | `str2` | None or XA-ADPCM | 1 or 2 | BS v2 | 2336 bytes |
| `vagi` | SPU-ADPCM | Any | | | | `str2cd` | None or XA-ADPCM | 1 or 2 | BS v2 | 2352 bytes |
| `str` | XA-ADPCM (optional) | 1 or 2 | BS v2/v3/v3dc | 2336 bytes | | `sbs2` | None | | BS v2 | Any |
| `strcd` | XA-ADPCM (optional) | 1 or 2 | BS v2/v3/v3dc | 2352 bytes |
| `strv` | | | BS v2/v3/v3dc | 2048 bytes |
| `sbs` | | | BS v2/v3/v3dc | |
Notes: Notes:
- The `xa`, `xacd`, `str` and `strcd` formats will output files with 2336- or - `vag` and `vagi` are similar to `spu` and `spui` respectively, but add a .VAG
2352-byte CD-ROM sectors, containing the appropriate CD-XA subheaders and
dummy EDC/ECC placeholders in addition to the actual sector data. Such files
**cannot be added to a disc image as-is** and must instead be parsed by an
authoring tool capable of rebuilding the EDC/ECC data (as it is dependent on
the file's absolute location on the disc) and generating a Mode 2 CD-ROM image
with "native" 2352-byte sectors.
- Similarly, files generated with `-t xa` or `-t xacd` **must be interleaved**
**with other XA-ADPCM tracks or empty padding using an external tool** before
they can be played.
- `vag` and `vagi` are similar to `spu` and `spui` respectively, but add a .vag
header at the beginning of the file. The header is always 48 bytes long for header at the beginning of the file. The header is always 48 bytes long for
`vag` files, while in the case of `vagi` files it is padded to the size `vag` files, while in the case of `vagi` files it is padded to the size
specified using the `-a` option (2048 bytes by default). Note that `vagi` specified using the `-a` option (2048 bytes by default). Note that `vagi`
files with more than 2 channels and/or alignment other than 2048 bytes are not files with more than 2 channels and/or alignment other than 2048 bytes are not
standardized. standardized.
- ~~The `strspu` format encodes the input file's audio track as a series of~~ - The `sbs2` format (used in some System 573 games) is simply a series of
~~custom .str chunks (type ID `0x0001` by default) holding interleaved~~ concatenated BS v2 frames, each padded to the size specified by the `-a`
~~SPU-ADPCM data in the same format as `spui`, rather than XA-ADPCM. As .str~~ option, with no additional headers besides the BS frame headers.
~~chunks do not require custom XA subheaders, a file with standard 2048-byte~~
~~sectors that does not need any special handling will be generated.~~ *This*
*format has not yet been implemented.*
- The `strv` format disables audio altogether and is equivalent to `strspu` on
an input file with no audio track.
- The `sbs` format (used in some System 573 games) consists of a series of
concatenated BS frames, each padded to the size specified by the `-a` option
(the default setting is 8192 bytes), with no additional headers besides the BS
frame headers.
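
The padding described in the notes above (a 48-byte .vag header padded to the `-a` size for `vagi`, and each `sbs` frame padded to the `-a` size) can be sketched as a round-up to the next multiple of the alignment. This is a minimal illustration, not psxavenc's actual code: `align_up` and `vagi_header_size` are hypothetical names, and it assumes that "padded to the size specified" means rounding up to a multiple of that size.

```c
#include <assert.h>
#include <stddef.h>

/* Size of an unpadded .vag header, as stated in the README notes. */
#define VAG_HEADER_SIZE 48

/* Hypothetical helper: round size up to the next multiple of alignment. */
static size_t align_up(size_t size, size_t alignment) {
	assert(alignment > 0);
	return ((size + alignment - 1) / alignment) * alignment;
}

/* Hypothetical helper: size of a vagi header once padded to the -a value. */
static size_t vagi_header_size(size_t alignment) {
	return align_up(VAG_HEADER_SIZE, alignment);
}
```

With the default alignment of 2048 bytes, `vagi_header_size(2048)` yields 2048, so the first audio chunk starts on a sector boundary. Under the same assumption, an `sbs` frame of 8193 bytes with `-a 8192` would occupy 16384 bytes.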
## Supported video codecs
All formats with a video track (`str`, `strcd`, `strv` and `sbs`) can use any of
the codecs listed below. The codec can be set using the `-v` option.
| Codec | Supported by | Typ. decoder CPU usage |
| :------------- | :-------------------- | :--------------------- |
| `v2` (default) | All players/decoders | Medium |
| `v3` | Most players/decoders | High |
| `v3dc` | Few players/decoders | High |
Notes:
- The `v3dc` format is a variant of `v3` with a slightly better compression
ratio, however most tools and playback libraries (including FFmpeg, jPSXdec
and earlier versions of Sony's own BS decoder) are unable to decode it
correctly; its use is thus highly discouraged. Refer to
[the psx-spx section on DC coefficient encoding](https://psx-spx.consoledev.net/cdromfileformats/#dc-v3)
for more details.


@@ -29,21 +29,14 @@ freely, subject to the following restrictions:
#define SHIFT_RANGE_4BPS 12 #define SHIFT_RANGE_4BPS 12
#define SHIFT_RANGE_8BPS 8 #define SHIFT_RANGE_8BPS 8
#define ADPCM_FILTER_COUNT 5 #define ADPCM_FILTER_COUNT 5
#define XA_ADPCM_FILTER_COUNT 4 #define XA_ADPCM_FILTER_COUNT 4
#define SPU_ADPCM_FILTER_COUNT 5 #define SPU_ADPCM_FILTER_COUNT 5
static const int16_t filter_k1[ADPCM_FILTER_COUNT] = {0, 60, 115, 98, 122}; static const int16_t filter_k1[ADPCM_FILTER_COUNT] = {0, 60, 115, 98, 122};
static const int16_t filter_k2[ADPCM_FILTER_COUNT] = {0, 0, -52, -55, -60}; static const int16_t filter_k2[ADPCM_FILTER_COUNT] = {0, 0, -52, -55, -60};
static int find_min_shift( static int find_min_shift(const psx_audio_encoder_channel_state_t *state, int16_t *samples, int sample_limit, int pitch, int filter, int shift_range) {
const psx_audio_encoder_channel_state_t *state,
const int16_t *samples,
int sample_limit,
int pitch,
int filter,
int shift_range
) {
// Assumption made: // Assumption made:
// //
// There is value in shifting right one step further to allow the nibbles to clip. // There is value in shifting right one step further to allow the nibbles to clip.
@@ -61,7 +54,7 @@ static int find_min_shift(
int32_t s_min = 0; int32_t s_min = 0;
int32_t s_max = 0; int32_t s_max = 0;
for (int i = 0; i < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; i++) { for (int i = 0; i < 28; i++) {
int32_t raw_sample = (i >= sample_limit) ? 0 : samples[i * pitch]; int32_t raw_sample = (i >= sample_limit) ? 0 : samples[i * pitch];
int32_t previous_values = (k1*prev1 + k2*prev2 + (1<<5))>>6; int32_t previous_values = (k1*prev1 + k2*prev2 + (1<<5))>>6;
int32_t sample = raw_sample - previous_values; int32_t sample = raw_sample - previous_values;
@@ -78,19 +71,7 @@ static int find_min_shift(
return min_shift; return min_shift;
} }
static uint8_t attempt_to_encode( static uint8_t attempt_to_encode(psx_audio_encoder_channel_state_t *outstate, const psx_audio_encoder_channel_state_t *instate, int16_t *samples, int sample_limit, int pitch, uint8_t *data, int data_shift, int data_pitch, int filter, int sample_shift, int shift_range) {
psx_audio_encoder_channel_state_t *outstate,
const psx_audio_encoder_channel_state_t *instate,
const int16_t *samples,
int sample_limit,
int pitch,
uint8_t *data,
int data_shift,
int data_pitch,
int filter,
int sample_shift,
int shift_range
) {
uint8_t sample_mask = 0xFFFF >> shift_range; uint8_t sample_mask = 0xFFFF >> shift_range;
uint8_t nondata_mask = ~(sample_mask << data_shift); uint8_t nondata_mask = ~(sample_mask << data_shift);
@@ -106,7 +87,7 @@ static uint8_t attempt_to_encode(
outstate->mse = 0; outstate->mse = 0;
for (int i = 0; i < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; i++) { for (int i = 0; i < 28; i++) {
int32_t sample = ((i >= sample_limit) ? 0 : samples[i * pitch]) + outstate->qerr; int32_t sample = ((i >= sample_limit) ? 0 : samples[i * pitch]) + outstate->qerr;
int32_t previous_values = (k1*outstate->prev1 + k2*outstate->prev2 + (1<<5))>>6; int32_t previous_values = (k1*outstate->prev1 + k2*outstate->prev2 + (1<<5))>>6;
int32_t sample_enc = sample - previous_values; int32_t sample_enc = sample - previous_values;
@@ -139,18 +120,8 @@ static uint8_t attempt_to_encode(
return hdr; return hdr;
} }
static uint8_t encode( static uint8_t encode(psx_audio_encoder_channel_state_t *state, int16_t *samples, int sample_limit, int pitch, uint8_t *data, int data_shift, int data_pitch, int filter_count, int shift_range) {
psx_audio_encoder_channel_state_t *state, psx_audio_encoder_channel_state_t proposed;
const int16_t *samples,
int sample_limit,
int pitch,
uint8_t *data,
int data_shift,
int data_pitch,
int filter_count,
int shift_range
) {
psx_audio_encoder_channel_state_t proposed;
int64_t best_mse = ((int64_t)1<<(int64_t)50); int64_t best_mse = ((int64_t)1<<(int64_t)50);
int best_filter = 0; int best_filter = 0;
int best_sample_shift = 0; int best_sample_shift = 0;
@@ -190,13 +161,7 @@ static uint8_t encode(
best_filter, best_sample_shift, shift_range); best_filter, best_sample_shift, shift_range);
} }
static void encode_block_xa( static void encode_block_xa(int16_t *audio_samples, int audio_samples_limit, uint8_t *data, psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state) {
const int16_t *audio_samples,
int audio_samples_limit,
uint8_t *data,
psx_audio_xa_settings_t settings,
psx_audio_encoder_state_t *state
) {
if (settings.bits_per_sample == 4) { if (settings.bits_per_sample == 4) {
if (settings.stereo) { if (settings.stereo) {
data[0] = encode(&(state->left), audio_samples, audio_samples_limit, 2, data + 0x10, 0, 4, XA_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS); data[0] = encode(&(state->left), audio_samples, audio_samples_limit, 2, data + 0x10, 0, 4, XA_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS);
@@ -240,17 +205,25 @@ uint32_t psx_audio_xa_get_buffer_size(psx_audio_xa_settings_t settings, int samp
} }
uint32_t psx_audio_spu_get_buffer_size(int sample_count) { uint32_t psx_audio_spu_get_buffer_size(int sample_count) {
return ((sample_count + PSX_AUDIO_SPU_SAMPLES_PER_BLOCK - 1) / PSX_AUDIO_SPU_SAMPLES_PER_BLOCK) << 4; return ((sample_count + 27) / 28) << 4;
} }
uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings) { uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings) {
return settings.format == PSX_AUDIO_XA_FORMAT_XA ? 2336 : 2352; return settings.format == PSX_AUDIO_XA_FORMAT_XA ? 2336 : 2352;
} }
uint32_t psx_audio_spu_get_buffer_size_per_block(void) {
return 16;
}
uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings) { uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings) {
return (((settings.bits_per_sample == 8) ? 112 : 224) >> (settings.stereo ? 1 : 0)) * 18; return (((settings.bits_per_sample == 8) ? 112 : 224) >> (settings.stereo ? 1 : 0)) * 18;
} }
uint32_t psx_audio_spu_get_samples_per_block(void) {
return 28;
}
uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings) { uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings) {
// 1/2 interleave for 37800 Hz 8-bit stereo at 1x speed // 1/2 interleave for 37800 Hz 8-bit stereo at 1x speed
int interleave = settings.stereo ? 2 : 4; int interleave = settings.stereo ? 2 : 4;
@@ -259,60 +232,40 @@ uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings) {
return interleave; return interleave;
} }
static inline void psx_audio_xa_sync_subheader_copy(psx_cdrom_sector_mode2_t *buffer) { static void psx_audio_xa_encode_init_sector(uint8_t *buffer, psx_audio_xa_settings_t settings) {
memcpy(buffer->subheader + 1, buffer->subheader, sizeof(psx_cdrom_sector_xa_subheader_t)); if (settings.format == PSX_AUDIO_XA_FORMAT_XACD) {
memset(buffer, 0, 2352);
memset(buffer+0x001, 0xFF, 10);
buffer[0x00F] = 0x02;
} else {
memset(buffer + 0x10, 0, 2336);
}
buffer[0x010] = settings.file_number;
buffer[0x011] = settings.channel_number & 0x1F;
buffer[0x012] = 0x24 | 0x40;
buffer[0x013] =
(settings.stereo ? 1 : 0)
| (settings.frequency >= PSX_AUDIO_XA_FREQ_DOUBLE ? 0 : 4)
| (settings.bits_per_sample >= 8 ? 16 : 0);
memcpy(buffer + 0x014, buffer + 0x010, 4);
} }
static void psx_audio_xa_encode_init_sector(psx_cdrom_sector_mode2_t *buffer, int lba, psx_audio_xa_settings_t settings) { int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state, int16_t* samples, int sample_count, uint8_t *output) {
if (settings.format == PSX_AUDIO_XA_FORMAT_XACD)
psx_cdrom_init_sector((psx_cdrom_sector_t *)buffer, lba, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2);
buffer->subheader[0].file = settings.file_number;
buffer->subheader[0].channel = settings.channel_number & PSX_CDROM_SECTOR_XA_CHANNEL_MASK;
buffer->subheader[0].submode =
PSX_CDROM_SECTOR_XA_SUBMODE_AUDIO
| PSX_CDROM_SECTOR_XA_SUBMODE_FORM2
| PSX_CDROM_SECTOR_XA_SUBMODE_RT;
if (settings.stereo)
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_STEREO;
else
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_MONO;
if (settings.frequency == PSX_AUDIO_XA_FREQ_DOUBLE)
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_FREQ_DOUBLE;
else
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_FREQ_SINGLE;
if (settings.bits_per_sample == 8)
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_BITS_8;
else
buffer->subheader[0].coding |= PSX_CDROM_SECTOR_XA_CODING_BITS_4;
psx_audio_xa_sync_subheader_copy(buffer);
}
int psx_audio_xa_encode(
psx_audio_xa_settings_t settings,
psx_audio_encoder_state_t *state,
const int16_t *samples,
int sample_count,
int lba,
uint8_t *output
) {
int sample_jump = (settings.bits_per_sample == 8) ? 112 : 224; int sample_jump = (settings.bits_per_sample == 8) ? 112 : 224;
int i, j; int i, j;
int xa_sector_size = psx_audio_xa_get_buffer_size_per_sector(settings); int xa_sector_size = settings.format == PSX_AUDIO_XA_FORMAT_XA ? 2336 : 2352;
int xa_offset = PSX_CDROM_SECTOR_SIZE - xa_sector_size; int xa_offset = 2352 - xa_sector_size;
uint8_t init_sector = 1; uint8_t init_sector = 1;
if (settings.stereo) if (settings.stereo) { sample_count <<= 1; }
sample_count *= 2;
for (i = 0, j = 0; i < sample_count || ((j % 18) != 0); i += sample_jump, j++) { for (i = 0, j = 0; i < sample_count || ((j % 18) != 0); i += sample_jump, j++) {
psx_cdrom_sector_mode2_t *sector_data = (psx_cdrom_sector_mode2_t*) (output + ((j/18) * xa_sector_size) - xa_offset); uint8_t *sector_data = output + ((j/18) * xa_sector_size) - xa_offset;
uint8_t *block_data = sector_data->data + ((j%18) * 0x80); uint8_t *block_data = sector_data + 0x18 + ((j%18) * 0x80);
if (init_sector) { if (init_sector) {
psx_audio_xa_encode_init_sector(sector_data, lba, settings); psx_audio_xa_encode_init_sector(sector_data, settings);
init_sector = 0; init_sector = 0;
} }
@@ -322,9 +275,8 @@ int psx_audio_xa_encode(
memcpy(block_data + 12, block_data + 8, 4); memcpy(block_data + 12, block_data + 8, 4);
if ((j+1)%18 == 0) { if ((j+1)%18 == 0) {
psx_cdrom_calculate_checksums((psx_cdrom_sector_t *)sector_data, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2); psx_cdrom_calculate_checksums(sector_data, PSX_CDROM_SECTOR_TYPE_MODE2_FORM2);
init_sector = 1; init_sector = 1;
lba++;
} }
} }
@@ -333,41 +285,28 @@ int psx_audio_xa_encode(
void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length) { void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length) {
if (output_length >= 2336) { if (output_length >= 2336) {
psx_cdrom_sector_mode2_t *sector = (psx_cdrom_sector_mode2_t*) &output[output_length - PSX_CDROM_SECTOR_SIZE]; output[output_length - 2352 + 0x12] |= 0x80;
sector->subheader[0].submode |= PSX_CDROM_SECTOR_XA_SUBMODE_EOF; output[output_length - 2352 + 0x18] |= 0x80;
psx_audio_xa_sync_subheader_copy(sector);
} }
} }
int psx_audio_xa_encode_simple( int psx_audio_xa_encode_simple(psx_audio_xa_settings_t settings, int16_t* samples, int sample_count, uint8_t *output) {
psx_audio_xa_settings_t settings,
const int16_t *samples,
int sample_count,
int lba,
uint8_t *output
) {
psx_audio_encoder_state_t state; psx_audio_encoder_state_t state;
memset(&state, 0, sizeof(psx_audio_encoder_state_t)); memset(&state, 0, sizeof(psx_audio_encoder_state_t));
int length = psx_audio_xa_encode(settings, &state, samples, sample_count, lba, output); int length = psx_audio_xa_encode(settings, &state, samples, sample_count, output);
psx_audio_xa_encode_finalize(settings, output, length); psx_audio_xa_encode_finalize(settings, output, length);
return length; return length;
} }
int psx_audio_spu_encode( int psx_audio_spu_encode(psx_audio_encoder_channel_state_t *state, int16_t* samples, int sample_count, int pitch, uint8_t *output) {
psx_audio_encoder_channel_state_t *state, uint8_t prebuf[28];
const int16_t *samples,
int sample_count,
int pitch,
uint8_t *output
) {
uint8_t prebuf[PSX_AUDIO_SPU_SAMPLES_PER_BLOCK];
uint8_t *buffer = output; uint8_t *buffer = output;
for (int i = 0; i < sample_count; i += PSX_AUDIO_SPU_SAMPLES_PER_BLOCK, buffer += PSX_AUDIO_SPU_BLOCK_SIZE) { for (int i = 0; i < sample_count; i += 28, buffer += 16) {
buffer[0] = encode(state, samples + i * pitch, sample_count - i, pitch, prebuf, 0, 1, SPU_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS); buffer[0] = encode(state, samples + i * pitch, sample_count - i, pitch, prebuf, 0, 1, SPU_ADPCM_FILTER_COUNT, SHIFT_RANGE_4BPS);
buffer[1] = 0; buffer[1] = 0;
for (int j = 0; j < PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; j+=2) { for (int j = 0; j < 28; j+=2) {
buffer[2 + (j>>1)] = (prebuf[j] & 0x0F) | (prebuf[j+1] << 4); buffer[2 + (j>>1)] = (prebuf[j] & 0x0F) | (prebuf[j+1] << 4);
} }
} }
@@ -375,29 +314,29 @@ int psx_audio_spu_encode(
return buffer - output; return buffer - output;
} }
int psx_audio_spu_encode_simple(const int16_t *samples, int sample_count, uint8_t *output, int loop_start) { int psx_audio_spu_encode_simple(int16_t* samples, int sample_count, uint8_t *output, int loop_start) {
psx_audio_encoder_channel_state_t state; psx_audio_encoder_channel_state_t state;
memset(&state, 0, sizeof(psx_audio_encoder_channel_state_t)); memset(&state, 0, sizeof(psx_audio_encoder_channel_state_t));
int length = psx_audio_spu_encode(&state, samples, sample_count, 1, output); int length = psx_audio_spu_encode(&state, samples, sample_count, 1, output);
if (length >= PSX_AUDIO_SPU_BLOCK_SIZE) { if (length >= 32) {
uint8_t *last_block = output + length - PSX_AUDIO_SPU_BLOCK_SIZE;
if (loop_start < 0) { if (loop_start < 0) {
last_block[1] |= PSX_AUDIO_SPU_LOOP_END; //output[1] = PSX_AUDIO_SPU_LOOP_START;
output[length - 16 + 1] = PSX_AUDIO_SPU_LOOP_END;
// Insert trailing looping block
memset(output + length, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
output[length + 1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
length += PSX_AUDIO_SPU_BLOCK_SIZE;
} else { } else {
int loop_start_offset = loop_start / PSX_AUDIO_SPU_SAMPLES_PER_BLOCK * PSX_AUDIO_SPU_BLOCK_SIZE; psx_audio_spu_set_flag_at_sample(output, loop_start, PSX_AUDIO_SPU_LOOP_START);
output[length - 16 + 1] = PSX_AUDIO_SPU_LOOP_REPEAT;
last_block[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
output[loop_start_offset + 1] |= PSX_AUDIO_SPU_LOOP_START;
} }
} else if (length >= 16) {
output[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
if (loop_start >= 0)
output[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
} }
return length; return length;
} }
void psx_audio_spu_set_flag_at_sample(uint8_t* spu_data, int sample_pos, int flag) {
int buffer_pos = (sample_pos / 28) << 4;
spu_data[buffer_pos + 1] = flag;
}
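
The block arithmetic that recurs throughout this file (28 samples per 16-byte SPU-ADPCM block, with the loop flag byte at offset 1 of each block) can be summarized in a short sketch. The constants and formulas mirror `psx_audio_spu_get_buffer_size` and `psx_audio_spu_set_flag_at_sample` as shown in the diff; the function names `spu_buffer_size` and `spu_flag_offset` are made up for this illustration and are not part of libpsxav.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the diff above. */
#define SPU_BLOCK_SIZE        16
#define SPU_SAMPLES_PER_BLOCK 28

/* Bytes needed to hold sample_count samples, rounded up to whole blocks. */
static uint32_t spu_buffer_size(int sample_count) {
	assert(sample_count >= 0);
	int blocks = (sample_count + SPU_SAMPLES_PER_BLOCK - 1) / SPU_SAMPLES_PER_BLOCK;
	return (uint32_t)blocks * SPU_BLOCK_SIZE;
}

/* Byte offset of the flag byte of the block that contains sample_pos. */
static uint32_t spu_flag_offset(int sample_pos) {
	assert(sample_pos >= 0);
	return (uint32_t)(sample_pos / SPU_SAMPLES_PER_BLOCK) * SPU_BLOCK_SIZE + 1;
}
```

For example, 29 samples need two blocks (32 bytes), and a loop point at sample 28 falls into the second block, whose flag byte sits at offset 17.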


@@ -21,91 +21,49 @@ freely, subject to the following restrictions:
3. This notice may not be removed or altered from any source distribution. 3. This notice may not be removed or altered from any source distribution.
*/ */
#include <stdint.h>
#include <string.h> #include <string.h>
#include "libpsxav.h" #include "libpsxav.h"
#define EDC_CRC32_POLYNOMIAL 0xD8018001 static uint32_t psx_cdrom_calculate_edc(uint8_t *sector, uint32_t offset, uint32_t size)
{
static uint32_t edc_crc32(uint8_t *data, int length) {
uint32_t edc = 0; uint32_t edc = 0;
for (int i = offset; i < offset+size; i++) {
for (int i = 0; i < length; i++) { edc ^= 0xFF&(uint32_t)sector[i];
edc ^= 0xFF & (uint32_t)data[i]; for (int ibit = 0; ibit < 8; ibit++) {
edc = (edc>>1)^(0xD8018001*(edc&0x1));
for (int j = 0; j < 8; j++) }
edc = (edc >> 1) ^ (EDC_CRC32_POLYNOMIAL * (edc & 0x1));
} }
return edc; return edc;
} }
#define TO_BCD(x) ((x) + ((x) / 10) * 6) void psx_cdrom_calculate_checksums(uint8_t *sector, psx_cdrom_sector_type_t type)
{
void psx_cdrom_init_xa_subheader(psx_cdrom_sector_xa_subheader_t *subheader, psx_cdrom_sector_type_t type) {
memset(subheader, 0, sizeof(psx_cdrom_sector_xa_subheader_t) * 2);
subheader->submode = PSX_CDROM_SECTOR_XA_SUBMODE_DATA;
if (type == PSX_CDROM_SECTOR_TYPE_MODE2_FORM2)
subheader->submode |= PSX_CDROM_SECTOR_XA_SUBMODE_FORM2;
memcpy(subheader + 1, subheader, sizeof(psx_cdrom_sector_xa_subheader_t));
}
void psx_cdrom_init_sector(psx_cdrom_sector_t *sector, int lba, psx_cdrom_sector_type_t type) {
// Sync sequence
memset(sector->mode1.sync + 1, 0xff, 10);
sector->mode1.sync[0x0] = 0x00;
sector->mode1.sync[0xB] = 0x00;
// Timecode
lba += 150;
sector->mode1.header.minute = TO_BCD(lba / 4500);
sector->mode1.header.second = TO_BCD((lba / 75) % 60);
sector->mode1.header.sector = TO_BCD(lba % 75);
// Mode
if (type == PSX_CDROM_SECTOR_TYPE_MODE1) {
sector->mode1.header.mode = 0x01;
} else {
sector->mode2.header.mode = 0x02;
psx_cdrom_init_xa_subheader(sector->mode2.subheader, type);
}
}
void psx_cdrom_calculate_checksums(psx_cdrom_sector_t *sector, psx_cdrom_sector_type_t type) {
uint8_t *data = (uint8_t *)sector;
uint32_t edc;
switch (type) { switch (type) {
case PSX_CDROM_SECTOR_TYPE_MODE1: case PSX_CDROM_SECTOR_TYPE_MODE1: {
edc = edc_crc32(data, 0x810); uint32_t edc = psx_cdrom_calculate_edc(sector, 0x0, 0x810);
sector[0x810] = (uint8_t)(edc);
sector[0x811] = (uint8_t)(edc >> 8);
sector[0x812] = (uint8_t)(edc >> 16);
sector[0x813] = (uint8_t)(edc >> 24);
data[0x810] = (uint8_t)(edc);
data[0x811] = (uint8_t)(edc >> 8);
data[0x812] = (uint8_t)(edc >> 16);
data[0x813] = (uint8_t)(edc >> 24);
memset(sector + 0x814, 0, 8); memset(sector + 0x814, 0, 8);
// TODO: ECC // TODO: ECC
break; } break;
case PSX_CDROM_SECTOR_TYPE_MODE2_FORM1: {
uint32_t edc = psx_cdrom_calculate_edc(sector, 0x10, 0x808);
sector[0x818] = (uint8_t)(edc);
sector[0x819] = (uint8_t)(edc >> 8);
sector[0x81A] = (uint8_t)(edc >> 16);
sector[0x81B] = (uint8_t)(edc >> 24);
case PSX_CDROM_SECTOR_TYPE_MODE2_FORM1:
edc = edc_crc32(data + 0x10, 0x808);
data[0x818] = (uint8_t)(edc);
data[0x819] = (uint8_t)(edc >> 8);
data[0x81A] = (uint8_t)(edc >> 16);
data[0x81B] = (uint8_t)(edc >> 24);
// TODO: ECC // TODO: ECC
break; } break;
case PSX_CDROM_SECTOR_TYPE_MODE2_FORM2: {
case PSX_CDROM_SECTOR_TYPE_MODE2_FORM2: uint32_t edc = psx_cdrom_calculate_edc(sector, 0x10, 0x91C);
edc = edc_crc32(data + 0x10, 0x91C); sector[0x92C] = (uint8_t)(edc);
sector[0x92D] = (uint8_t)(edc >> 8);
data[0x92C] = (uint8_t)(edc); sector[0x92E] = (uint8_t)(edc >> 16);
data[0x92D] = (uint8_t)(edc >> 8); sector[0x92F] = (uint8_t)(edc >> 24);
data[0x92E] = (uint8_t)(edc >> 16); } break;
data[0x92F] = (uint8_t)(edc >> 24);
break;
} }
} }
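
The sector timecode written by `psx_cdrom_init_sector` on the `main` side of this diff can be illustrated in isolation. The sketch below follows the same steps as the diff (add 150 to the LBA, split into minute/second/sector at 75 sectors per second, convert each field to BCD with the `TO_BCD` formula). The `msf_t` type and the names `to_bcd` and `lba_to_msf` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for a BCD-encoded minute/second/sector timecode. */
typedef struct {
	uint8_t minute;
	uint8_t second;
	uint8_t sector;
} msf_t;

/* Same formula as the TO_BCD macro in the diff; valid for 0..99. */
static uint8_t to_bcd(int x) {
	assert(x >= 0 && x <= 99);
	return (uint8_t)(x + (x / 10) * 6);
}

/* Convert an LBA to a BCD timecode, mirroring the steps in the diff. */
static msf_t lba_to_msf(int lba) {
	assert(lba >= 0);
	lba += 150; /* same offset as in the diff: 150 sectors = 2 s at 75 sectors/s */

	msf_t msf;
	msf.minute = to_bcd(lba / 4500);
	msf.second = to_bcd((lba / 75) % 60);
	msf.sector = to_bcd(lba % 75);
	return msf;
}
```

LBA 0 therefore maps to 00:02:00, and LBA 4350 maps to 01:00:00. Since the EDC/ECC data depends on the sector's absolute location, this is also why the `main` side threads an `lba` argument through the XA encoder.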


@@ -21,20 +21,16 @@ freely, subject to the following restrictions:
3. This notice may not be removed or altered from any source distribution. 3. This notice may not be removed or altered from any source distribution.
*/ */
#pragma once #ifndef __LIBPSXAV_H__
#define __LIBPSXAV_H__
#include <stdbool.h> #include <stdbool.h>
#include <stdint.h> #include <stdint.h>
// audio.c // audio.c
#define PSX_AUDIO_SPU_BLOCK_SIZE 16 #define PSX_AUDIO_XA_FREQ_SINGLE 18900
#define PSX_AUDIO_SPU_SAMPLES_PER_BLOCK 28 #define PSX_AUDIO_XA_FREQ_DOUBLE 37800
enum {
PSX_AUDIO_XA_FREQ_SINGLE = 18900,
PSX_AUDIO_XA_FREQ_DOUBLE = 37800
};
typedef enum { typedef enum {
PSX_AUDIO_XA_FORMAT_XA, // .xa file PSX_AUDIO_XA_FORMAT_XA, // .xa file
@@ -61,113 +57,34 @@ typedef struct {
psx_audio_encoder_channel_state_t right; psx_audio_encoder_channel_state_t right;
} psx_audio_encoder_state_t; } psx_audio_encoder_state_t;
enum { #define PSX_AUDIO_SPU_LOOP_END 1
PSX_AUDIO_SPU_LOOP_END = 1 << 0, #define PSX_AUDIO_SPU_LOOP_REPEAT 3
PSX_AUDIO_SPU_LOOP_REPEAT = 3 << 0, #define PSX_AUDIO_SPU_LOOP_START 4
PSX_AUDIO_SPU_LOOP_START = 1 << 2
};
uint32_t psx_audio_xa_get_buffer_size(psx_audio_xa_settings_t settings, int sample_count); uint32_t psx_audio_xa_get_buffer_size(psx_audio_xa_settings_t settings, int sample_count);
uint32_t psx_audio_spu_get_buffer_size(int sample_count); uint32_t psx_audio_spu_get_buffer_size(int sample_count);
uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings); uint32_t psx_audio_xa_get_buffer_size_per_sector(psx_audio_xa_settings_t settings);
uint32_t psx_audio_spu_get_buffer_size_per_block(void);
uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings); uint32_t psx_audio_xa_get_samples_per_sector(psx_audio_xa_settings_t settings);
uint32_t psx_audio_spu_get_samples_per_block(void);
uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings); uint32_t psx_audio_xa_get_sector_interleave(psx_audio_xa_settings_t settings);
int psx_audio_xa_encode( int psx_audio_xa_encode(psx_audio_xa_settings_t settings, psx_audio_encoder_state_t *state, int16_t* samples, int sample_count, uint8_t *output);
psx_audio_xa_settings_t settings, int psx_audio_xa_encode_simple(psx_audio_xa_settings_t settings, int16_t* samples, int sample_count, uint8_t *output);
psx_audio_encoder_state_t *state, int psx_audio_spu_encode(psx_audio_encoder_channel_state_t *state, int16_t* samples, int sample_count, int pitch, uint8_t *output);
const int16_t *samples, int psx_audio_spu_encode_simple(int16_t* samples, int sample_count, uint8_t *output, int loop_start);
int sample_count,
int lba,
uint8_t *output
);
int psx_audio_xa_encode_simple(
psx_audio_xa_settings_t settings,
const int16_t *samples,
int sample_count,
int lba,
uint8_t *output
);
int psx_audio_spu_encode(
psx_audio_encoder_channel_state_t *state,
const int16_t *samples,
int sample_count,
int pitch,
uint8_t *output
);
int psx_audio_spu_encode_simple(const int16_t *samples, int sample_count, uint8_t *output, int loop_start);
void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length); void psx_audio_xa_encode_finalize(psx_audio_xa_settings_t settings, uint8_t *output, int output_length);
void psx_audio_spu_set_flag_at_sample(uint8_t* spu_data, int sample_pos, int flag);
// cdrom.c // cdrom.c
#define PSX_CDROM_SECTOR_SIZE 2352 #define PSX_CDROM_SECTOR_SIZE 2352
typedef struct {
uint8_t minute;
uint8_t second;
uint8_t sector;
uint8_t mode;
} psx_cdrom_sector_header_t;
typedef struct {
uint8_t file;
uint8_t channel;
uint8_t submode;
uint8_t coding;
} psx_cdrom_sector_xa_subheader_t;
typedef struct {
uint8_t sync[12];
psx_cdrom_sector_header_t header;
uint8_t data[0x920];
} psx_cdrom_sector_mode1_t;
typedef struct {
uint8_t sync[12];
psx_cdrom_sector_header_t header;
psx_cdrom_sector_xa_subheader_t subheader[2];
uint8_t data[0x918];
} psx_cdrom_sector_mode2_t;
typedef union {
psx_cdrom_sector_mode1_t mode1;
psx_cdrom_sector_mode2_t mode2;
} psx_cdrom_sector_t;
_Static_assert(sizeof(psx_cdrom_sector_mode1_t) == PSX_CDROM_SECTOR_SIZE, "Invalid Mode1 sector size");
_Static_assert(sizeof(psx_cdrom_sector_mode2_t) == PSX_CDROM_SECTOR_SIZE, "Invalid Mode2 sector size");
#define PSX_CDROM_SECTOR_XA_CHANNEL_MASK 0x1F
enum {
PSX_CDROM_SECTOR_XA_SUBMODE_EOR = 1 << 0,
PSX_CDROM_SECTOR_XA_SUBMODE_VIDEO = 1 << 1,
PSX_CDROM_SECTOR_XA_SUBMODE_AUDIO = 1 << 2,
PSX_CDROM_SECTOR_XA_SUBMODE_DATA = 1 << 3,
PSX_CDROM_SECTOR_XA_SUBMODE_TRIGGER = 1 << 4,
PSX_CDROM_SECTOR_XA_SUBMODE_FORM2 = 1 << 5,
PSX_CDROM_SECTOR_XA_SUBMODE_RT = 1 << 6,
PSX_CDROM_SECTOR_XA_SUBMODE_EOF = 1 << 7
};
enum {
PSX_CDROM_SECTOR_XA_CODING_MONO = 0 << 0,
PSX_CDROM_SECTOR_XA_CODING_STEREO = 1 << 0,
PSX_CDROM_SECTOR_XA_CODING_CHANNEL_MASK = 3 << 0,
PSX_CDROM_SECTOR_XA_CODING_FREQ_DOUBLE = 0 << 2,
PSX_CDROM_SECTOR_XA_CODING_FREQ_SINGLE = 1 << 2,
PSX_CDROM_SECTOR_XA_CODING_FREQ_MASK = 3 << 2,
PSX_CDROM_SECTOR_XA_CODING_BITS_4 = 0 << 4,
PSX_CDROM_SECTOR_XA_CODING_BITS_8 = 1 << 4,
PSX_CDROM_SECTOR_XA_CODING_BITS_MASK = 3 << 4,
PSX_CDROM_SECTOR_XA_CODING_EMPHASIS = 1 << 6
};
typedef enum { typedef enum {
PSX_CDROM_SECTOR_TYPE_MODE1, PSX_CDROM_SECTOR_TYPE_MODE1,
PSX_CDROM_SECTOR_TYPE_MODE2_FORM1, PSX_CDROM_SECTOR_TYPE_MODE2_FORM1,
PSX_CDROM_SECTOR_TYPE_MODE2_FORM2 PSX_CDROM_SECTOR_TYPE_MODE2_FORM2
} psx_cdrom_sector_type_t; } psx_cdrom_sector_type_t;
void psx_cdrom_init_xa_subheader(psx_cdrom_sector_xa_subheader_t *subheader, psx_cdrom_sector_type_t type); void psx_cdrom_calculate_checksums(uint8_t *sector, psx_cdrom_sector_type_t type);
void psx_cdrom_init_sector(psx_cdrom_sector_t *sector, int lba, psx_cdrom_sector_type_t type);
void psx_cdrom_calculate_checksums(psx_cdrom_sector_t *sector, psx_cdrom_sector_type_t type); #endif /* __LIBPSXAV_H__ */
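The sector structures added on the main side depend on every field being a plain byte, so the compiler inserts no padding and both layouts come out at exactly 2352 bytes. The following is a minimal sketch of that arithmetic, not the library's code. All names in it (`demo_mode2_t`, `demo_fill_video_subheader` and so on) are hypothetical stand-ins. The submode value mirrors the `0x08 | 0x40` byte written by the older `init_sector_buffer_video()`, which also stored the subheader twice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libpsxav sector types, for illustration only. */
enum { DEMO_SECTOR_SIZE = 2352 };

typedef struct { uint8_t minute, second, sector, mode; } demo_header_t;
typedef struct { uint8_t file, channel, submode, coding; } demo_xa_subheader_t;

typedef struct {
	uint8_t sync[12];
	demo_header_t header;
	uint8_t data[0x920];               /* 12 + 4 + 2336 = 2352 */
} demo_mode1_t;

typedef struct {
	uint8_t sync[12];
	demo_header_t header;
	demo_xa_subheader_t subheader[2];  /* the subheader is stored twice */
	uint8_t data[0x918];               /* 12 + 4 + 8 + 2328 = 2352 */
} demo_mode2_t;

_Static_assert(sizeof(demo_mode1_t) == DEMO_SECTOR_SIZE, "Mode1 size mismatch");
_Static_assert(sizeof(demo_mode2_t) == DEMO_SECTOR_SIZE, "Mode2 size mismatch");

enum {
	DEMO_SUBMODE_DATA = 1 << 3,
	DEMO_SUBMODE_RT   = 1 << 6
};

/* Fills both copies of the XA subheader for a real-time data (video) sector. */
static void demo_fill_video_subheader(demo_mode2_t *sector, uint8_t file, uint8_t channel) {
	sector->subheader[0].file    = file;
	sector->subheader[0].channel = channel & 0x1F; /* channel numbers are 5 bits wide */
	sector->subheader[0].submode = DEMO_SUBMODE_DATA | DEMO_SUBMODE_RT;
	sector->subheader[0].coding  = 0;
	sector->subheader[1]         = sector->subheader[0];
}
```

With this layout the Mode 2 payload starts at byte offset 24, which is where the older byte-offset code's `0x10` subheader offset plus the 8 subheader bytes also lands.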


@ -1,32 +1,28 @@
project('psxavenc', 'c', default_options: ['c_std=c11']) project('psxavenc', 'c', default_options: ['c_std=c11'])
add_project_arguments('-D_POSIX_C_SOURCE=201112L', '-ffast-math', language : 'c') add_project_arguments('-D_POSIX_C_SOURCE=201112L', language : 'c')
conf_data = configuration_data()
conf_data.set('VERSION', '"' + run_command('git', '-C', meson.project_source_root(), 'describe', '--tags', '--always', '--dirty', '--match=v*', check: true).stdout().strip() + '"')
configure_file(output: 'config.h', configuration: conf_data)
libm_dep = meson.get_compiler('c').find_library('m') libm_dep = meson.get_compiler('c').find_library('m')
ffmpeg = [ ffmpeg = [
dependency('libavformat'), dependency('libavformat'),
dependency('libavcodec'), dependency('libavcodec'),
dependency('libavutil'), dependency('libavutil'),
dependency('libswresample'), dependency('libswresample'),
dependency('libswscale') dependency('libswscale')
] ]
libpsxav = static_library('psxav', [ libpsxav = static_library('psxav', [
'libpsxav/adpcm.c', 'libpsxav/adpcm.c',
'libpsxav/cdrom.c', 'libpsxav/cdrom.c',
'libpsxav/libpsxav.h' 'libpsxav/libpsxav.h'
]) ])
libpsxav_dep = declare_dependency(include_directories: include_directories('libpsxav'), link_with: libpsxav) libpsxav_dep = declare_dependency(include_directories: include_directories('libpsxav'), link_with: libpsxav)
executable('psxavenc', [ executable('psxavenc', [
'psxavenc/args.c', 'psxavenc/cdrom.c',
'psxavenc/decoding.c', 'psxavenc/decoding.c',
'psxavenc/filefmt.c', 'psxavenc/filefmt.c',
'psxavenc/main.c', 'psxavenc/mdec.c',
'psxavenc/mdec.c' 'psxavenc/psxavenc.c'
], dependencies: [libm_dep, ffmpeg, libpsxav_dep], install: true) ], dependencies: [libm_dep, ffmpeg, libpsxav_dep], install: true)


@ -1,722 +0,0 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "args.h"
#include "config.h"
#define INVALID_PARAM -1
static int parse_int(
int *output,
const char *name,
const char *value,
int min_value,
int max_value
) {
if (value == NULL) {
fprintf(stderr, "Missing %s value after option\n", name);
return INVALID_PARAM;
}
*output = strtol(value, NULL, 0);
if (
(*output < min_value) ||
(max_value >= 0 && *output > max_value)
) {
if (max_value >= 0)
fprintf(stderr, "Invalid %s: %d (must be in %d-%d range)\n", name, *output, min_value, max_value);
else
fprintf(stderr, "Invalid %s: %d (must be %d or greater)\n", name, *output, min_value);
return INVALID_PARAM;
}
return 2;
}
static int parse_int_one_of(
int *output,
const char *name,
const char *value,
int value_a,
int value_b
) {
if (value == NULL) {
fprintf(stderr, "Missing %s value after option\n", name);
return INVALID_PARAM;
}
*output = strtol(value, NULL, 0);
if (*output != value_a && *output != value_b) {
fprintf(stderr, "Invalid %s: %d (must be %d or %d)\n", name, *output, value_a, value_b);
return INVALID_PARAM;
}
return 2;
}
static int parse_enum(
int *output,
const char *name,
const char *value,
const char *const *choices,
int count
) {
if (value == NULL) {
fprintf(stderr, "Missing %s value after option\n", name);
return INVALID_PARAM;
}
for (int i = 0; i < count; i++) {
if (strcmp(value, choices[i]) == 0) {
*output = i;
return 2;
}
}
fprintf(
stderr,
"Invalid %s: %s\n"
"Must be one of the following values:\n",
name,
value
);
for (int i = 0; i < count; i++)
fprintf(stderr, " %s\n", choices[i]);
return INVALID_PARAM;
}
static const char *const general_options_help =
"General options:\n"
" -h Show this help message and exit\n"
" -V Show version information and exit\n"
" -q Suppress all non-error messages\n"
" -t format Use (or show help for) specified output format\n"
" xa: [A.] XA-ADPCM, 2336-byte sectors\n"
" xacd: [A.] XA-ADPCM, 2352-byte sectors\n"
" spu: [A.] raw SPU-ADPCM mono data\n"
" spui: [A.] raw SPU-ADPCM interleaved data\n"
" vag: [A.] .vag SPU-ADPCM mono\n"
" vagi: [A.] .vag SPU-ADPCM interleaved\n"
" str: [AV] .str video + XA-ADPCM, 2336-byte sectors\n"
" strcd: [AV] .str video + XA-ADPCM, 2352-byte sectors\n"
//" strspu: [AV] .str video + SPU-ADPCM, 2048-byte sectors\n"
" strv: [.V] .str video, 2048-byte sectors\n"
" sbs: [.V] .sbs video\n"
" -R key=value,... Pass custom options to libswresample (see FFmpeg docs)\n"
" -S key=value,... Pass custom options to libswscale (see FFmpeg docs)\n"
"\n";
static const char *const format_names[NUM_FORMATS] = {
"xa",
"xacd",
"spu",
"vag",
"spui",
"vagi",
"str",
"strcd",
"strspu",
"strv",
"sbs"
};
static void init_default_args(args_t *args) {
if (
args->format == FORMAT_XA ||
args->format == FORMAT_XACD ||
args->format == FORMAT_STR ||
args->format == FORMAT_STRCD
)
args->audio_frequency = 37800;
else
args->audio_frequency = 44100;
if (args->format == FORMAT_SPU || args->format == FORMAT_VAG)
args->audio_channels = 1;
else
args->audio_channels = 2;
args->audio_bit_depth = 4;
args->audio_xa_file = 0;
args->audio_xa_channel = 0;
args->audio_interleave = 2048;
args->audio_loop_point = -1;
args->video_codec = BS_CODEC_V2;
args->video_width = 320;
args->video_height = 240;
args->str_fps_num = 15;
args->str_fps_den = 1;
args->str_cd_speed = 2;
args->str_video_id = 0x8001;
args->str_audio_id = 0x0001;
if (args->format == FORMAT_SPU || args->format == FORMAT_VAG)
args->alignment = 64; // Default SPU DMA chunk size
else if (args->format == FORMAT_SBS)
args->alignment = 8192; // Default for System 573 games
else
args->alignment = 2048;
}
static int parse_general_option(args_t *args, char option, const char *param) {
int parsed;
switch (option) {
case '-':
args->flags |= FLAG_IGNORE_OPTIONS;
return 1;
case 'h':
args->flags |= FLAG_PRINT_HELP;
return 1;
case 'V':
args->flags |= FLAG_PRINT_VERSION;
return 1;
case 'q':
args->flags |= FLAG_QUIET | FLAG_HIDE_PROGRESS;
return 1;
case 't':
parsed = parse_enum(&(args->format), "format", param, format_names, NUM_FORMATS);
if (parsed > 0)
init_default_args(args);
return parsed;
case 'R':
if (param == NULL) {
fprintf(stderr, "Missing libswresample parameter list after option\n");
return INVALID_PARAM;
}
args->swresample_options = param;
return 2;
case 'S':
if (param == NULL) {
fprintf(stderr, "Missing libswscale parameter list after option\n");
return INVALID_PARAM;
}
args->swscale_options = param;
return 2;
default:
return 0;
}
}
static const char *const xa_options_help =
"XA-ADPCM options:\n"
" [-f 18900|37800] [-c 1|2] [-b 4|8] [-F 0-255] [-C 0-31]\n"
"\n"
" -f 18900|37800 Use specified sample rate (default 37800)\n"
" -c 1|2 Use specified channel count (default 2)\n"
" -b 4|8 Use specified bit depth (default 4)\n"
" -F 0-255 Set CD-XA file number (for both audio and video, default 0)\n"
" -C 0-31 Set CD-XA channel number (for both audio and video, default 0)\n"
"\n";
static int parse_xa_option(args_t *args, char option, const char *param) {
switch (option) {
case 'f':
return parse_int_one_of(&(args->audio_frequency), "sample rate", param, 18900, 37800);
case 'c':
return parse_int_one_of(&(args->audio_channels), "channel count", param, 1, 2);
case 'b':
return parse_int_one_of(&(args->audio_bit_depth), "bit depth", param, 4, 8);
case 'F':
return parse_int(&(args->audio_xa_file), "file number", param, 0, 255);
case 'C':
return parse_int(&(args->audio_xa_channel), "channel number", param, 0, 31);
default:
return 0;
}
}
static const char *const spu_options_help =
"Mono SPU-ADPCM options:\n"
" [-f freq] [-a size] [-l ms | -L] [-D]\n"
"\n"
" -f freq Use specified sample rate (default 44100)\n"
" -a size Pad audio data excluding header to multiple of given size (default 64)\n"
" -l ms Add loop point at specified offset (in milliseconds)\n"
" -L Set loop end flag at the end of data but do not add a loop point\n"
" -D Do not prepend encoded data with a dummy silent block\n"
"\n";
static int parse_spu_option(args_t *args, char option, const char *param) {
switch (option) {
case 'f':
return parse_int(&(args->audio_frequency), "sample rate", param, 1, -1);
case 'a':
return parse_int(&(args->alignment), "alignment", param, 1, -1);
case 'l':
args->flags |= FLAG_SPU_LOOP_END;
return parse_int(&(args->audio_loop_point), "loop offset", param, 0, -1);
case 'L':
args->flags |= FLAG_SPU_LOOP_END;
return 1;
case 'D':
args->flags |= FLAG_SPU_NO_LEADING_DUMMY;
return 1;
default:
return 0;
}
}
static const char *const spui_options_help =
"Interleaved SPU-ADPCM options:\n"
" [-f freq] [-c channels] [-i size] [-a size] [-L] [-D]\n"
"\n"
" -f freq Use specified sample rate (default 44100)\n"
" -c channels Use specified channel count (default 2)\n"
" -i size Use specified channel interleave size (default 2048)\n"
" -a size Pad .vag header and each audio chunk to multiples of given size\n"
" (default 2048)\n"
" -L Set loop end flag at the end of each audio chunk\n"
" -D Do not prepend first chunk's data with a dummy silent block\n"
"\n";
static int parse_spui_option(args_t *args, char option, const char *param) {
int parsed;
switch (option) {
case 'f':
return parse_int(&(args->audio_frequency), "sample rate", param, 1, -1);
case 'c':
return parse_int(&(args->audio_channels), "channel count", param, 1, -1);
case 'i':
parsed = parse_int(&(args->audio_interleave), "interleave", param, 16, -1);
// Round up to nearest multiple of 16
args->audio_interleave = (args->audio_interleave + 15) & ~15;
return parsed;
case 'a':
return parse_int(&(args->alignment), "alignment", param, 1, -1);
case 'L':
args->flags |= FLAG_SPU_LOOP_END;
return 1;
case 'D':
args->flags |= FLAG_SPU_NO_LEADING_DUMMY;
return 1;
default:
return 0;
}
}
static const char *const bs_options_help =
"Video options:\n"
" [-v v2|v3|v3dc] [-s WxH] [-I]\n"
"\n"
" -v codec Use specified video codec\n"
" v2: MDEC BS v2 (default)\n"
" v3: MDEC BS v3\n"
" v3dc: MDEC BS v3, expect decoder to wrap DC coefficients\n"
" -s WxH Rescale input file to fit within specified size\n"
" (16x16-640x512 in 16-pixel increments, default 320x240)\n"
" -I Force stretching to given size without preserving aspect ratio\n"
"\n";
const char *const bs_codec_names[NUM_BS_CODECS] = {
"v2",
"v3",
"v3dc"
};
static int parse_bs_option(args_t *args, char option, const char *param) {
char *next = NULL;
switch (option) {
case 'v':
return parse_enum(&(args->video_codec), "video codec", param, bs_codec_names, NUM_BS_CODECS);
case 's':
if (param == NULL) {
fprintf(stderr, "Missing video size after option\n");
return INVALID_PARAM;
}
args->video_width = strtol(param, &next, 10);
if (next && *next == 'x') {
args->video_height = strtol(next + 1, NULL, 10);
} else {
fprintf(stderr, "Invalid video size (must be specified as <width>x<height>)\n");
return INVALID_PARAM;
}
if (args->video_width < 16 || args->video_width > 640) {
fprintf(stderr, "Invalid video width: %d (must be in 16-640 range)\n", args->video_width);
return INVALID_PARAM;
}
if (args->video_height < 16 || args->video_height > 512) {
fprintf(stderr, "Invalid video height: %d (must be in 16-512 range)\n", args->video_height);
return INVALID_PARAM;
}
// Round up to nearest multiples of 16
args->video_width = (args->video_width + 15) & ~15;
args->video_height = (args->video_height + 15) & ~15;
return 2;
case 'I':
args->flags |= FLAG_BS_IGNORE_ASPECT;
return 1;
default:
return 0;
}
}
static const char *const str_options_help =
".str container options:\n"
" [-r num[/den]] [-x 1|2] [-T id] [-A id] [-X]\n"
"\n"
" -r num[/den] Set video frame rate to specified integer or fraction (default 15)\n"
" -x 1|2 Set CD-ROM speed the file is meant to be played at (default 2)\n"
" -T id Tag video sectors with specified .str type ID (default 0x8001)\n"
" -A id Tag SPU-ADPCM sectors with specified .str type ID (default 0x0001)\n"
" -X Place audio sectors after corresponding video sectors\n"
" (rather than ahead of them)\n"
"\n";
static int parse_str_option(args_t *args, char option, const char *param) {
char *next = NULL;
int fps;
switch (option) {
case 'r':
if (param == NULL) {
fprintf(stderr, "Missing frame rate value after option\n");
return INVALID_PARAM;
}
args->str_fps_num = strtol(param, &next, 10);
if (next && *next == '/')
args->str_fps_den = strtol(next + 1, NULL, 10);
else
args->str_fps_den = 1;
if (args->str_fps_num <= 0 || args->str_fps_den <= 0) {
fprintf(stderr, "Invalid frame rate (must be a non-zero integer or fraction)\n");
return INVALID_PARAM;
}
fps = args->str_fps_num / args->str_fps_den;
if (fps < 1 || fps > 60) {
fprintf(stderr, "Invalid frame rate: %d/%d (must be in 1-60 range)\n", args->str_fps_num, args->str_fps_den);
return INVALID_PARAM;
}
return 2;
case 'x':
return parse_int_one_of(&(args->str_cd_speed), "CD-ROM speed", param, 1, 2);
case 'T':
return parse_int(&(args->str_video_id), "video track type ID", param, 0x0000, 0xFFFF);
case 'A':
return parse_int(&(args->str_audio_id), "audio track type ID", param, 0x0000, 0xFFFF);
case 'X':
args->flags |= FLAG_STR_TRAILING_AUDIO;
return 1;
default:
return 0;
}
}
static const char *const sbs_options_help =
".sbs container options:\n"
" [-a size]\n"
"\n"
" -a size Set size of each video frame (default 8192)\n"
"\n";
static int parse_sbs_option(args_t *args, char option, const char *param) {
switch (option) {
case 'a':
return parse_int(&(args->alignment), "video frame size", param, 256, -1);
default:
return 0;
}
}
static const char *const general_usage =
"Usage:\n"
" psxavenc -t xa|xacd [xa-options] <in> <out.xa>\n"
" psxavenc -t spu|vag [spu-options] <in> <out.vag>\n"
" psxavenc -t spui|vagi [spui-options] <in> <out.vag>\n"
" psxavenc -t str|strcd [xa-options] [bs-options] [str-options] <in> <out.str>\n"
//" psxavenc -t strspu [spui-options] [bs-options] [str-options] <in> <out.str>\n"
" psxavenc -t strv [bs-options] [str-options] <in> <out.str>\n"
" psxavenc -t sbs [bs-options] [sbs-options] <in> <out.sbs>\n"
"\n";
static const struct {
const char *usage;
const char *audio_options_help;
const char *video_options_help;
const char *container_options_help;
int (*parse_audio_option)(args_t *, char, const char *);
int (*parse_video_option)(args_t *, char, const char *);
int (*parse_container_option)(args_t *, char, const char *);
} format_info[NUM_FORMATS] = {
{
.usage = "psxavenc -t xa [xa-options] <in> <out.xa>",
.audio_options_help = xa_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_xa_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t xacd [xa-options] <in> <out.xa>",
.audio_options_help = xa_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_xa_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t spu [spu-options] <in> <out>",
.audio_options_help = spu_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_spu_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t vag [spu-options] <in> <out.vag>",
.audio_options_help = spu_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_spu_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t spui [spui-options] <in> <out>",
.audio_options_help = spui_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_spui_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t vagi [spui-options] <in> <out.vag>",
.audio_options_help = spui_options_help,
.video_options_help = NULL,
.container_options_help = NULL,
.parse_audio_option = parse_spui_option,
.parse_video_option = NULL,
.parse_container_option = NULL
}, {
.usage = "psxavenc -t str [xa-options] [bs-options] [str-options] <in> <out.str>",
.audio_options_help = xa_options_help,
.video_options_help = bs_options_help,
.container_options_help = str_options_help,
.parse_audio_option = parse_xa_option,
.parse_video_option = parse_bs_option,
.parse_container_option = parse_str_option
}, {
.usage = "psxavenc -t strcd [xa-options] [bs-options] [str-options] <in> <out.str>",
.audio_options_help = xa_options_help,
.video_options_help = bs_options_help,
.container_options_help = str_options_help,
.parse_audio_option = parse_xa_option,
.parse_video_option = parse_bs_option,
.parse_container_option = parse_str_option
}, {
.usage = "psxavenc -t strspu [spui-options] [bs-options] [str-options] <in> <out.str>",
.audio_options_help = spui_options_help,
.video_options_help = bs_options_help,
.container_options_help = str_options_help,
.parse_audio_option = parse_spui_option,
.parse_video_option = parse_bs_option,
.parse_container_option = parse_str_option
}, {
.usage = "psxavenc -t strv [bs-options] [str-options] <in> <out.str>",
.audio_options_help = NULL,
.video_options_help = bs_options_help,
.container_options_help = str_options_help,
.parse_audio_option = NULL,
.parse_video_option = parse_bs_option,
.parse_container_option = parse_str_option
}, {
.usage = "psxavenc -t sbs [bs-options] [sbs-options] <in> <out.sbs>",
.audio_options_help = NULL,
.video_options_help = bs_options_help,
.container_options_help = sbs_options_help,
.parse_audio_option = NULL,
.parse_video_option = parse_bs_option,
.parse_container_option = parse_sbs_option
}
};
static int parse_option(args_t *args, char option, const char *param) {
int parsed = parse_general_option(args, option, param);
if (parsed == 0 && args->format != FORMAT_INVALID) {
if (format_info[args->format].parse_audio_option != NULL)
parsed = format_info[args->format].parse_audio_option(args, option, param);
}
if (parsed == 0 && args->format != FORMAT_INVALID) {
if (format_info[args->format].parse_video_option != NULL)
parsed = format_info[args->format].parse_video_option(args, option, param);
}
if (parsed == 0 && args->format != FORMAT_INVALID) {
if (format_info[args->format].parse_container_option != NULL)
parsed = format_info[args->format].parse_container_option(args, option, param);
}
if (parsed == 0) {
if (args->format == FORMAT_INVALID)
fprintf(
stderr,
"Unknown general option: -%c\n"
"(if this is a format-specific option, it shall be passed after -t)\n",
option
);
else
fprintf(stderr, "Unknown option for format %s: -%c\n", format_names[args->format], option);
}
return parsed;
}
static void print_help(format_t format) {
if (format == FORMAT_INVALID) {
printf(
"%s%s%s%s%s%s%s%s",
general_usage,
general_options_help,
xa_options_help,
spu_options_help,
spui_options_help,
bs_options_help,
str_options_help,
sbs_options_help
);
return;
}
printf(
"Usage:\n"
" %s\n"
"\n"
"%s",
format_info[format].usage,
general_options_help
);
if (format_info[format].audio_options_help != NULL)
printf("%s", format_info[format].audio_options_help);
if (format_info[format].video_options_help != NULL)
printf("%s", format_info[format].video_options_help);
if (format_info[format].container_options_help != NULL)
printf("%s", format_info[format].container_options_help);
}
bool parse_args(args_t *args, const char *const *options, int count) {
int arg_index = 0;
while (arg_index < count) {
const char *option = options[arg_index];
if (option[0] == '-' && option[1] != 0 && option[2] == 0 && !(args->flags & FLAG_IGNORE_OPTIONS)) {
const char *param;
if ((arg_index + 1) < count)
param = options[arg_index + 1];
else
param = NULL;
int parsed = parse_option(args, option[1], param);
if (parsed <= 0)
return false;
arg_index += parsed;
continue;
}
if (args->input_file == NULL) {
args->input_file = option;
} else if (args->output_file == NULL) {
args->output_file = option;
} else {
fprintf(stderr, "There should be no arguments after the output file path\n");
return false;
}
arg_index++;
}
if (args->flags & FLAG_PRINT_HELP) {
print_help(args->format);
return false;
}
if (args->flags & FLAG_PRINT_VERSION) {
printf("psxavenc " VERSION "\n");
return false;
}
if (args->format == FORMAT_INVALID || args->input_file == NULL || args->output_file == NULL) {
fprintf(
stderr,
"%s"
"For more information about the options supported for a given output format, run:\n"
" psxavenc -t <format> -h\n"
"To view the full list of supported options, run:\n"
" psxavenc -h\n",
general_usage
);
return false;
}
return true;
}
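The option handlers in this file share one return convention: each returns the number of argument entries it consumed (1 for a bare flag, 2 for a flag plus its value), 0 when it does not recognise the option, and a negative value on error. The following is a minimal sketch of that convention, assuming a cut-down argument structure. `demo_args_t`, `demo_parse_option` and `demo_parse_args` are hypothetical names, not part of psxavenc. The sketch also checks `option[1]` before `option[2]`, so a lone `-` is never indexed past its terminator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, cut-down argument structure for illustration only. */
typedef struct {
	int quiet;
	int sample_rate;
} demo_args_t;

/* Returns the number of entries consumed, 0 if unknown, negative on error. */
static int demo_parse_option(demo_args_t *args, char option, const char *param) {
	switch (option) {
		case 'q':
			args->quiet = 1;
			return 1;

		case 'f':
			if (param == NULL)
				return -1;

			args->sample_rate = (int) strtol(param, NULL, 0);
			return 2;

		default:
			return 0;
	}
}

/* Returns the number of positional arguments found, or -1 on error. */
static int demo_parse_args(demo_args_t *args, const char *const *options, int count) {
	int positional = 0;
	int index      = 0;

	while (index < count) {
		const char *option = options[index];

		/* Only strings of the exact form "-x" are treated as options. */
		if (option[0] == '-' && option[1] != 0 && option[2] == 0) {
			const char *param = (index + 1 < count) ? options[index + 1] : NULL;
			int parsed = demo_parse_option(args, option[1], param);

			if (parsed <= 0)
				return -1;

			index += parsed;
			continue;
		}

		positional++;
		index++;
	}

	return positional;
}
```

Because the handler reports how far to advance, the main loop does not need to know which options take a value, and per-format handlers can be chained until one of them returns a non-zero result.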


@ -1,95 +0,0 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#pragma once
#include <stdbool.h>
#define NUM_FORMATS 11
#define NUM_BS_CODECS 3
enum {
FLAG_IGNORE_OPTIONS = 1 << 0,
FLAG_QUIET = 1 << 1,
FLAG_HIDE_PROGRESS = 1 << 2,
FLAG_PRINT_HELP = 1 << 3,
FLAG_PRINT_VERSION = 1 << 4,
FLAG_SPU_LOOP_END = 1 << 5,
FLAG_SPU_NO_LEADING_DUMMY = 1 << 6,
FLAG_BS_IGNORE_ASPECT = 1 << 7,
FLAG_STR_TRAILING_AUDIO = 1 << 8
};
typedef enum {
FORMAT_INVALID = -1,
FORMAT_XA,
FORMAT_XACD,
FORMAT_SPU,
FORMAT_VAG,
FORMAT_SPUI,
FORMAT_VAGI,
FORMAT_STR,
FORMAT_STRCD,
FORMAT_STRSPU,
FORMAT_STRV,
FORMAT_SBS
} format_t;
typedef enum {
BS_CODEC_INVALID = -1,
BS_CODEC_V2,
BS_CODEC_V3,
BS_CODEC_V3DC
} bs_codec_t;
typedef struct {
int flags;
format_t format;
const char *input_file;
const char *output_file;
const char *swresample_options;
const char *swscale_options;
int audio_frequency; // 18900 or 37800 Hz
int audio_channels;
int audio_bit_depth; // 4 or 8
int audio_xa_file; // 00-FF
int audio_xa_channel; // 00-1F
int audio_interleave;
int audio_loop_point;
bs_codec_t video_codec;
int video_width;
int video_height;
int str_fps_num;
int str_fps_den;
int str_cd_speed; // 1 or 2
int str_video_id;
int str_audio_id;
int alignment;
} args_t;
bool parse_args(args_t *args, const char *const *options, int count);


@ -3,7 +3,6 @@ psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg
This software is provided 'as-is', without any express or implied This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages warranty. In no event will the authors be held liable for any damages
@ -22,15 +21,40 @@ freely, subject to the following restrictions:
3. This notice may not be removed or altered from any source distribution. 3. This notice may not be removed or altered from any source distribution.
*/ */
#pragma once #include "common.h"
#include <stdio.h> void init_sector_buffer_video(uint8_t *buffer, settings_t *settings) {
#include "args.h" int offset;
#include "decoding.h" if (settings->format == FORMAT_STR2CD) {
memset(buffer, 0, 2352);
memset(buffer+0x001, 0xFF, 10);
buffer[0x00F] = 0x02;
offset = 0x10;
} else {
memset(buffer, 0, 2336);
offset = 0;
}
void encode_file_xa(const args_t *args, decoder_t *decoder, FILE *output); buffer[offset+0] = settings->file_number;
void encode_file_spu(const args_t *args, decoder_t *decoder, FILE *output); buffer[offset+1] = settings->channel_number & 0x1F;
void encode_file_spui(const args_t *args, decoder_t *decoder, FILE *output); buffer[offset+2] = 0x08 | 0x40;
void encode_file_str(const args_t *args, decoder_t *decoder, FILE *output); buffer[offset+3] = 0x00;
void encode_file_strspu(const args_t *args, decoder_t *decoder, FILE *output); memcpy(buffer + offset + 4, buffer + offset, 4);
void encode_file_sbs(const args_t *args, decoder_t *decoder, FILE *output); }
void calculate_edc_data(uint8_t *buffer)
{
uint32_t edc = 0;
for (int i = 0x010; i < 0x818; i++) {
edc ^= 0xFF&(uint32_t)buffer[i];
for (int ibit = 0; ibit < 8; ibit++) {
edc = (edc>>1)^(0xD8018001*(edc&0x1));
}
}
buffer[0x818] = (uint8_t)(edc);
buffer[0x819] = (uint8_t)(edc >> 8);
buffer[0x81A] = (uint8_t)(edc >> 16);
buffer[0x81B] = (uint8_t)(edc >> 24);
// TODO: ECC
}
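The `calculate_edc_data()` routine on the v0.2.0 side is a bitwise, LSB-first CRC over bytes 0x010 to 0x817 of a 2352-byte sector, with the 32-bit result stored little-endian at 0x818. The following sketch separates that loop into a reusable form. `demo_edc_update` and `demo_edc_store_le` are hypothetical helper names, and the sketch covers only the EDC part (the ECC marked as TODO above is not addressed).

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise EDC update using the same reflected polynomial as the code above. */
static uint32_t demo_edc_update(uint32_t edc, const uint8_t *data, size_t length) {
	for (size_t i = 0; i < length; i++) {
		edc ^= data[i];

		/* Shift out one bit at a time, folding in the polynomial on a 1 bit. */
		for (int bit = 0; bit < 8; bit++)
			edc = (edc >> 1) ^ ((edc & 1) ? 0xD8018001u : 0u);
	}

	return edc;
}

/* Stores the EDC as four little-endian bytes. */
static void demo_edc_store_le(uint8_t *output, uint32_t edc) {
	output[0] = (uint8_t) (edc);
	output[1] = (uint8_t) (edc >> 8);
	output[2] = (uint8_t) (edc >> 16);
	output[3] = (uint8_t) (edc >> 24);
}
```

For the sector layout used above, a call would look like `demo_edc_store_le(buffer + 0x818, demo_edc_update(0, buffer + 0x010, 0x808))`. A 256-entry lookup table is the usual speed-up if the bit loop ever shows up in profiles.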

psxavenc/common.h Normal file

@ -0,0 +1,146 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>
#include <unistd.h>
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libswresample/swresample.h>
#include <libpsxav.h>
#define NUM_FORMATS 9
#define FORMAT_XA 0
#define FORMAT_XACD 1
#define FORMAT_SPU 2
#define FORMAT_SPUI 3
#define FORMAT_VAG 4
#define FORMAT_VAGI 5
#define FORMAT_STR2 6
#define FORMAT_STR2CD 7
#define FORMAT_SBS2 8
typedef struct {
int frame_index;
int frame_data_offset;
int frame_max_size;
int frame_block_base_overflow;
int frame_block_overflow_num;
int frame_block_overflow_den;
uint16_t bits_value;
int bits_left;
uint8_t *frame_output;
int bytes_used;
int blocks_used;
int uncomp_hwords_used;
int quant_scale;
int quant_scale_sum;
float *dct_block_lists[6];
} vid_encoder_state_t;
typedef struct {
int video_frame_dst_size;
int audio_stream_index;
int video_stream_index;
AVFormatContext* format;
AVStream* audio_stream;
AVStream* video_stream;
AVCodecContext* audio_codec_context;
AVCodecContext* video_codec_context;
struct SwrContext* resampler;
struct SwsContext* scaler;
AVFrame* frame;
int sample_count_mul;
double video_next_pts;
} av_decoder_state_t;
typedef struct {
bool quiet;
bool show_progress;
int format; // FORMAT_*
int channels;
int cd_speed; // 1 or 2
int frequency; // 18900 or 37800 Hz
int bits_per_sample; // 4 or 8
int file_number; // 00-FF
int channel_number; // 00-1F
int interleave;
int alignment;
bool loop;
int video_width;
int video_height;
int video_fps_num; // FPS numerator
int video_fps_den; // FPS denominator
bool ignore_aspect_ratio;
char *swresample_options;
char *swscale_options;
int16_t *audio_samples;
int audio_sample_count;
uint8_t *video_frames;
int video_frame_count;
av_decoder_state_t decoder_state_av;
vid_encoder_state_t state_vid;
bool end_of_input;
time_t start_time;
time_t last_progress_update;
} settings_t;
// cdrom.c
void init_sector_buffer_video(uint8_t *buffer, settings_t *settings);
void calculate_edc_data(uint8_t *buffer);
// decoding.c
bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bool use_video, bool audio_required, bool video_required);
bool poll_av_data(settings_t *settings);
bool ensure_av_data(settings_t *settings, int needed_audio_samples, int needed_video_frames);
void retire_av_data(settings_t *settings, int retired_audio_samples, int retired_video_frames);
void close_av_data(settings_t *settings);
// filefmt.c
void encode_file_spu(settings_t *settings, FILE *output);
void encode_file_spu_interleaved(settings_t *settings, FILE *output);
void encode_file_xa(settings_t *settings, FILE *output);
void encode_file_str(settings_t *settings, FILE *output);
void encode_file_sbs(settings_t *settings, FILE *output);
// mdec.c
void encode_frame_bs(uint8_t *video_frame, settings_t *settings);
void encode_sector_str(uint8_t *video_frames, uint8_t *output, settings_t *settings);


@ -22,52 +22,30 @@ freely, subject to the following restrictions:
3. This notice may not be removed or altered from any source distribution. 3. This notice may not be removed or altered from any source distribution.
*/ */
#include <assert.h> #include "common.h"
#include <stdbool.h>
#include <stdio.h> int decode_frame(AVCodecContext *codec, AVFrame *frame, int *frame_size, AVPacket *packet) {
#include <stdlib.h> int ret;
#include <string.h>
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavcodec/avdct.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#include <libswscale/swscale.h>
#include "args.h"
#include "decoding.h"
static bool decode_frame(AVCodecContext *codec, AVFrame *frame, int *frame_size, AVPacket *packet) {
if (packet != NULL) { if (packet != NULL) {
if (avcodec_send_packet(codec, packet) != 0) ret = avcodec_send_packet(codec, packet);
return false; if (ret != 0) {
return 0;
}
} }
int ret = avcodec_receive_frame(codec, frame); ret = avcodec_receive_frame(codec, frame);
if (ret >= 0) { if (ret >= 0) {
*frame_size = ret; *frame_size = ret;
return true; return 1;
} else {
return ret == AVERROR(EAGAIN) ? 1 : 0;
} }
if (ret == AVERROR(EAGAIN))
return true;
return false;
} }
bool open_av_data(decoder_t *decoder, const args_t *args, int flags) { bool open_av_data(const char *filename, settings_t *settings, bool use_audio, bool use_video, bool audio_required, bool video_required)
decoder->audio_samples = NULL; {
decoder->audio_sample_count = 0; av_decoder_state_t* av = &(settings->decoder_state_av);
decoder->video_frames = NULL;
decoder->video_frame_count = 0;
decoder->video_width = args->video_width;
decoder->video_height = args->video_height;
decoder->video_fps_num = args->str_fps_num;
decoder->video_fps_den = args->str_fps_den;
decoder->end_of_input = false;
decoder_state_t *av = &(decoder->state);
av->video_next_pts = 0.0; av->video_next_pts = 0.0;
av->frame = NULL; av->frame = NULL;
av->video_frame_dst_size = 0; av->video_frame_dst_size = 0;
@ -81,17 +59,19 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
av->resampler = NULL; av->resampler = NULL;
av->scaler = NULL; av->scaler = NULL;
if (args->flags & FLAG_QUIET) if (settings->quiet) {
av_log_set_level(AV_LOG_QUIET); av_log_set_level(AV_LOG_QUIET);
}
av->format = avformat_alloc_context(); av->format = avformat_alloc_context();
if (avformat_open_input(&(av->format), filename, NULL, NULL)) {
if (avformat_open_input(&(av->format), args->input_file, NULL, NULL))
return false; return false;
if (avformat_find_stream_info(av->format, NULL) < 0) }
if (avformat_find_stream_info(av->format, NULL) < 0) {
return false; return false;
}
if (flags & DECODER_USE_AUDIO) { if (use_audio) {
for (int i = 0; i < av->format->nb_streams; i++) { for (int i = 0; i < av->format->nb_streams; i++) {
if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) { if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
if (av->audio_stream_index >= 0) { if (av->audio_stream_index >= 0) {
@ -101,14 +81,13 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
av->audio_stream_index = i; av->audio_stream_index = i;
} }
} }
if (audio_required && av->audio_stream_index == -1) {
if ((flags & DECODER_AUDIO_REQUIRED) && av->audio_stream_index == -1) {
fprintf(stderr, "Input file has no audio data\n"); fprintf(stderr, "Input file has no audio data\n");
return false; return false;
} }
} }
if (flags & DECODER_USE_VIDEO) { if (use_video) {
for (int i = 0; i < av->format->nb_streams; i++) { for (int i = 0; i < av->format->nb_streams; i++) {
if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) { if (av->format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
if (av->video_stream_index >= 0) { if (av->video_stream_index >= 0) {
@ -118,8 +97,7 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
av->video_stream_index = i; av->video_stream_index = i;
} }
} }
if (video_required && av->video_stream_index == -1) {
if ((flags & DECODER_VIDEO_REQUIRED) && av->video_stream_index == -1) {
fprintf(stderr, "Input file has no video data\n"); fprintf(stderr, "Input file has no video data\n");
return false; return false;
} }
@ -131,39 +109,34 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
if (av->audio_stream != NULL) { if (av->audio_stream != NULL) {
const AVCodec *codec = avcodec_find_decoder(av->audio_stream->codecpar->codec_id); const AVCodec *codec = avcodec_find_decoder(av->audio_stream->codecpar->codec_id);
av->audio_codec_context = avcodec_alloc_context3(codec); av->audio_codec_context = avcodec_alloc_context3(codec);
if (av->audio_codec_context == NULL) {
if (av->audio_codec_context == NULL)
return false; return false;
if (avcodec_parameters_to_context(av->audio_codec_context, av->audio_stream->codecpar) < 0) }
if (avcodec_parameters_to_context(av->audio_codec_context, av->audio_stream->codecpar) < 0) {
return false; return false;
if (avcodec_open2(av->audio_codec_context, codec, NULL) < 0) }
if (avcodec_open2(av->audio_codec_context, codec, NULL) < 0) {
return false; return false;
}
AVChannelLayout layout; AVChannelLayout layout;
layout.nb_channels = args->audio_channels; layout.nb_channels = settings->channels;
if (settings->channels <= 2) {
if (args->audio_channels == 1) {
layout.order = AV_CHANNEL_ORDER_NATIVE; layout.order = AV_CHANNEL_ORDER_NATIVE;
layout.u.mask = AV_CH_LAYOUT_MONO; layout.u.mask = (settings->channels == 2) ? AV_CH_LAYOUT_STEREO : AV_CH_LAYOUT_MONO;
} else if (args->audio_channels == 2) {
layout.order = AV_CHANNEL_ORDER_NATIVE;
layout.u.mask = AV_CH_LAYOUT_STEREO;
} else { } else {
layout.order = AV_CHANNEL_ORDER_UNSPEC; layout.order = AV_CHANNEL_ORDER_UNSPEC;
} }
if (!settings->quiet && settings->channels > av->audio_codec_context->ch_layout.nb_channels) {
if (!(args->flags & FLAG_QUIET)) { fprintf(stderr, "Warning: input file has less than %d channels\n", settings->channels);
if (args->audio_channels > av->audio_codec_context->ch_layout.nb_channels)
fprintf(stderr, "Warning: input file has less than %d channels\n", args->audio_channels);
} }
av->sample_count_mul = args->audio_channels; av->sample_count_mul = settings->channels;
if (swr_alloc_set_opts2( if (swr_alloc_set_opts2(
&av->resampler, &av->resampler,
&layout, &layout,
AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16,
args->audio_frequency, settings->frequency,
&av->audio_codec_context->ch_layout, &av->audio_codec_context->ch_layout,
av->audio_codec_context->sample_fmt, av->audio_codec_context->sample_fmt,
av->audio_codec_context->sample_rate, av->audio_codec_context->sample_rate,
@ -172,43 +145,47 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
) < 0) { ) < 0) {
return false; return false;
} }
if (args->swresample_options) { if (settings->swresample_options) {
if (av_opt_set_from_string(av->resampler, args->swresample_options, NULL, "=", ":,") < 0) if (av_opt_set_from_string(av->resampler, settings->swresample_options, NULL, "=", ":,") < 0) {
return false; return false;
}
} }
if (swr_init(av->resampler) < 0)
if (swr_init(av->resampler) < 0) {
return false; return false;
}
} }
if (av->video_stream != NULL) { if (av->video_stream != NULL) {
const AVCodec *codec = avcodec_find_decoder(av->video_stream->codecpar->codec_id); const AVCodec *codec = avcodec_find_decoder(av->video_stream->codecpar->codec_id);
av->video_codec_context = avcodec_alloc_context3(codec); av->video_codec_context = avcodec_alloc_context3(codec);
if(av->video_codec_context == NULL) {
if (av->video_codec_context == NULL)
return false; return false;
if (avcodec_parameters_to_context(av->video_codec_context, av->video_stream->codecpar) < 0) }
if (avcodec_parameters_to_context(av->video_codec_context, av->video_stream->codecpar) < 0) {
return false; return false;
if (avcodec_open2(av->video_codec_context, codec, NULL) < 0) }
if (avcodec_open2(av->video_codec_context, codec, NULL) < 0) {
return false; return false;
if (!(args->flags & FLAG_QUIET)) {
if (
decoder->video_width > av->video_codec_context->width ||
decoder->video_height > av->video_codec_context->height
)
fprintf(stderr, "Warning: input file has resolution lower than %dx%d\n", decoder->video_width, decoder->video_height);
} }
if (!(args->flags & FLAG_BS_IGNORE_ASPECT)) { if (!settings->quiet && (
settings->video_width > av->video_codec_context->width ||
settings->video_height > av->video_codec_context->height
)) {
fprintf(stderr, "Warning: input file has resolution lower than %dx%d\n",
settings->video_width, settings->video_height
);
}
if (!settings->ignore_aspect_ratio) {
// Reduce the provided size so that it matches the input file's // Reduce the provided size so that it matches the input file's
// aspect ratio. // aspect ratio.
double src_ratio = (double)av->video_codec_context->width / (double)av->video_codec_context->height; double src_ratio = (double)av->video_codec_context->width / (double)av->video_codec_context->height;
double dst_ratio = (double)decoder->video_width / (double)decoder->video_height; double dst_ratio = (double)settings->video_width / (double)settings->video_height;
if (src_ratio < dst_ratio) { if (src_ratio < dst_ratio) {
decoder->video_width = (int)((double)decoder->video_height * src_ratio + 15.0) & ~15; settings->video_width = (int)((double)settings->video_height * src_ratio + 15.0) & ~15;
} else { } else {
decoder->video_height = (int)((double)decoder->video_width / src_ratio + 15.0) & ~15; settings->video_height = (int)((double)settings->video_width / src_ratio + 15.0) & ~15;
} }
} }
@ -216,248 +193,219 @@ bool open_av_data(decoder_t *decoder, const args_t *args, int flags) {
av->video_codec_context->width, av->video_codec_context->width,
av->video_codec_context->height, av->video_codec_context->height,
av->video_codec_context->pix_fmt, av->video_codec_context->pix_fmt,
decoder->video_width, settings->video_width,
decoder->video_height, settings->video_height,
AV_PIX_FMT_NV21, AV_PIX_FMT_NV21,
SWS_BICUBIC, SWS_BICUBIC,
NULL, NULL,
NULL, NULL,
NULL NULL
); );
if (av->scaler == NULL) if (av->scaler == NULL) {
return false; return false;
}
#if 0
// FIXME: if this is uncommented libswscale may produce completely black
// frames for whatever reason...
if (sws_setColorspaceDetails( if (sws_setColorspaceDetails(
av->scaler, av->scaler,
sws_getCoefficients(av->video_codec_context->colorspace), sws_getCoefficients(av->video_codec_context->colorspace),
av->video_codec_context->color_range == AVCOL_RANGE_JPEG, (av->video_codec_context->color_range == AVCOL_RANGE_JPEG),
sws_getCoefficients(SWS_CS_ITU601), sws_getCoefficients(SWS_CS_ITU601),
true, true,
0, 0,
1 << 16, 0,
1 << 16 0
) < 0) ) < 0) {
return false; return false;
if (args->swscale_options) { }
if (av_opt_set_from_string(av->scaler, args->swscale_options, NULL, "=", ":,") < 0) #endif
if (settings->swscale_options) {
if (av_opt_set_from_string(av->scaler, settings->swscale_options, NULL, "=", ":,") < 0) {
return false; return false;
}
} }
av->video_frame_dst_size = 3 * decoder->video_width * decoder->video_height / 2; av->video_frame_dst_size = 3*settings->video_width*settings->video_height/2;
} }
av->frame = av_frame_alloc(); av->frame = av_frame_alloc();
if (av->frame == NULL) {
if (av->frame == NULL)
return false; return false;
}
settings->audio_samples = NULL;
settings->audio_sample_count = 0;
settings->video_frames = NULL;
settings->video_frame_count = 0;
settings->end_of_input = false;
return true; return true;
} }
static void poll_av_packet_audio(decoder_t *decoder, AVPacket *packet) { static void poll_av_packet_audio(settings_t *settings, AVPacket *packet)
decoder_state_t *av = &(decoder->state); {
av_decoder_state_t* av = &(settings->decoder_state_av);
int frame_size; int frame_size, frame_sample_count;
uint8_t *buffer[1];
if (!decode_frame(av->audio_codec_context, av->frame, &frame_size, packet)) if (decode_frame(av->audio_codec_context, av->frame, &frame_size, packet)) {
return; size_t buffer_size = sizeof(int16_t) * av->sample_count_mul * swr_get_out_samples(av->resampler, av->frame->nb_samples);
buffer[0] = malloc(buffer_size);
int frame_sample_count = swr_get_out_samples(av->resampler, av->frame->nb_samples); memset(buffer[0], 0, buffer_size);
frame_sample_count = swr_convert(av->resampler, buffer, av->frame->nb_samples, (const uint8_t**)av->frame->data, av->frame->nb_samples);
if (frame_sample_count == 0) settings->audio_samples = realloc(settings->audio_samples, (settings->audio_sample_count + ((frame_sample_count + 4032) * av->sample_count_mul)) * sizeof(int16_t));
return; memmove(&(settings->audio_samples[settings->audio_sample_count]), buffer[0], sizeof(int16_t) * frame_sample_count * av->sample_count_mul);
settings->audio_sample_count += frame_sample_count * av->sample_count_mul;
size_t buffer_size = sizeof(int16_t) * av->sample_count_mul * frame_sample_count; free(buffer[0]);
uint8_t *buffer = malloc(buffer_size);
memset(buffer, 0, buffer_size);
frame_sample_count = swr_convert(
av->resampler,
&buffer,
frame_sample_count,
(const uint8_t**)av->frame->data,
av->frame->nb_samples
);
decoder->audio_samples = realloc(
decoder->audio_samples,
(decoder->audio_sample_count + ((frame_sample_count + 4032) * av->sample_count_mul)) * sizeof(int16_t)
);
memmove(
&(decoder->audio_samples[decoder->audio_sample_count]),
buffer,
sizeof(int16_t) * frame_sample_count * av->sample_count_mul
);
decoder->audio_sample_count += frame_sample_count * av->sample_count_mul;
free(buffer);
}
static void poll_av_packet_video(decoder_t *decoder, AVPacket *packet) {
decoder_state_t *av = &(decoder->state);
int frame_size;
double pts_step = (double)decoder->video_fps_den / (double)decoder->video_fps_num;
int plane_size = decoder->video_width * decoder->video_height;
int dst_strides[2] = {
decoder->video_width, decoder->video_width
};
if (!decode_frame(av->video_codec_context, av->frame, &frame_size, packet))
return;
if (!av->frame->width || !av->frame->height || !av->frame->data[0])
return;
// Some files seem to have timestamps starting from a negative value
// (but otherwise valid) for whatever reason.
double pts =
((double)av->frame->pts * (double)av->video_stream->time_base.num)
/ av->video_stream->time_base.den;
#if 0
if (pts < 0.0)
return;
#endif
if (decoder->video_frame_count >= 1 && pts < av->video_next_pts)
return;
if (decoder->video_frame_count < 1)
av->video_next_pts = pts;
else
av->video_next_pts += pts_step;
//fprintf(stderr, "%d %f %f %f\n", decoder->video_frame_count, pts, av->video_next_pts, pts_step);
// Insert duplicate frames if the frame rate of the input stream is
// lower than the target frame rate.
int dupe_frames = (int) ceil((pts - av->video_next_pts) / pts_step);
if (dupe_frames < 0) dupe_frames = 0;
decoder->video_frames = realloc(
decoder->video_frames,
(decoder->video_frame_count + dupe_frames + 1) * av->video_frame_dst_size
);
for (; dupe_frames; dupe_frames--) {
memcpy(
(decoder->video_frames) + av->video_frame_dst_size * decoder->video_frame_count,
(decoder->video_frames) + av->video_frame_dst_size * (decoder->video_frame_count - 1),
av->video_frame_dst_size
);
decoder->video_frame_count += 1;
av->video_next_pts += pts_step;
} }
uint8_t *dst_frame = decoder->video_frames + av->video_frame_dst_size * decoder->video_frame_count;
uint8_t *dst_pointers[2] = {
dst_frame, dst_frame + plane_size
};
sws_scale(
av->scaler,
(const uint8_t *const *) av->frame->data,
av->frame->linesize,
0,
av->frame->height,
dst_pointers,
dst_strides
);
decoder->video_frame_count += 1;
} }
bool poll_av_data(decoder_t *decoder) { static void poll_av_packet_video(settings_t *settings, AVPacket *packet)
decoder_state_t *av = &(decoder->state); {
av_decoder_state_t* av = &(settings->decoder_state_av);
if (decoder->end_of_input) int frame_size;
return false; double pts_step = ((double)1.0*(double)settings->video_fps_den)/(double)settings->video_fps_num;
int plane_size = settings->video_width*settings->video_height;
int dst_strides[2] = {
settings->video_width, settings->video_width
};
if (decode_frame(av->video_codec_context, av->frame, &frame_size, packet)) {
if (!av->frame->width || !av->frame->height || !av->frame->data[0]) {
return;
}
// Some files seem to have timestamps starting from a negative value
// (but otherwise valid) for whatever reason.
double pts = (((double)av->frame->pts)*(double)av->video_stream->time_base.num)/av->video_stream->time_base.den;
//if (pts < 0.0) {
//return;
//}
if (settings->video_frame_count >= 1 && pts < av->video_next_pts) {
return;
}
if ((settings->video_frame_count) < 1) {
av->video_next_pts = pts;
} else {
av->video_next_pts += pts_step;
}
//fprintf(stderr, "%d %f %f %f\n", (settings->video_frame_count), pts, av->video_next_pts, pts_step);
// Insert duplicate frames if the frame rate of the input stream is
// lower than the target frame rate.
int dupe_frames = (int) ceil((pts - av->video_next_pts) / pts_step);
if (dupe_frames < 0) dupe_frames = 0;
settings->video_frames = realloc(
settings->video_frames,
(settings->video_frame_count + dupe_frames + 1) * av->video_frame_dst_size
);
for (; dupe_frames; dupe_frames--) {
memcpy(
(settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count),
(settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count-1),
av->video_frame_dst_size
);
settings->video_frame_count += 1;
av->video_next_pts += pts_step;
}
uint8_t *dst_frame = (settings->video_frames) + av->video_frame_dst_size*(settings->video_frame_count);
uint8_t *dst_pointers[2] = {
dst_frame, dst_frame + plane_size
};
sws_scale(av->scaler, (const uint8_t *const *) av->frame->data, av->frame->linesize, 0, av->frame->height, dst_pointers, dst_strides);
settings->video_frame_count += 1;
}
}
bool poll_av_data(settings_t *settings)
{
av_decoder_state_t* av = &(settings->decoder_state_av);
AVPacket packet; AVPacket packet;
if (av_read_frame(av->format, &packet) >= 0) { if (settings->end_of_input) {
if (packet.stream_index == av->audio_stream_index) return false;
poll_av_packet_audio(decoder, &packet); }
else if (packet.stream_index == av->video_stream_index)
poll_av_packet_video(decoder, &packet);
if (av_read_frame(av->format, &packet) >= 0) {
if (packet.stream_index == av->audio_stream_index) {
poll_av_packet_audio(settings, &packet);
} else if (packet.stream_index == av->video_stream_index) {
poll_av_packet_video(settings, &packet);
}
av_packet_unref(&packet); av_packet_unref(&packet);
return true; return true;
} else { } else {
// out is always padded out with 4032 "0" samples, this makes calculations elsewhere easier // out is always padded out with 4032 "0" samples, this makes calculations elsewhere easier
if (av->audio_stream) if (av->audio_stream) {
memset( memset((settings->audio_samples) + (settings->audio_sample_count), 0, 4032 * av->sample_count_mul * sizeof(int16_t));
decoder->audio_samples + decoder->audio_sample_count, }
0,
4032 * av->sample_count_mul * sizeof(int16_t)
);
decoder->end_of_input = true; settings->end_of_input = true;
return false; return false;
} }
} }
bool ensure_av_data(decoder_t *decoder, int needed_audio_samples, int needed_video_frames) { bool ensure_av_data(settings_t *settings, int needed_audio_samples, int needed_video_frames)
// HACK: in order to update decoder->end_of_input as soon as all data has {
// been read from the input file, this loop waits for more data than while (settings->audio_sample_count < needed_audio_samples || settings->video_frame_count < needed_video_frames) {
// strictly needed. //fprintf(stderr, "ensure %d -> %d, %d -> %d\n", settings->audio_sample_count, needed_audio_samples, settings->video_frame_count, needed_video_frames);
#if 0 if (!poll_av_data(settings)) {
while (decoder->audio_sample_count < needed_audio_samples || decoder->video_frame_count < needed_video_frames) {
#else
while (
(needed_audio_samples && decoder->audio_sample_count <= needed_audio_samples) ||
(needed_video_frames && decoder->video_frame_count <= needed_video_frames)
) {
#endif
//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", decoder->audio_sample_count, needed_audio_samples, decoder->video_frame_count, needed_video_frames);
if (!poll_av_data(decoder)) {
// Keep returning true even if the end of the input file has been // Keep returning true even if the end of the input file has been
// reached, if the buffer is not yet completely empty. // reached, if the buffer is not yet completely empty.
return return (settings->audio_sample_count || !needed_audio_samples)
(decoder->audio_sample_count || !needed_audio_samples) && && (settings->video_frame_count || !needed_video_frames);
(decoder->video_frame_count || !needed_video_frames);
} }
} }
//fprintf(stderr, "ensure %d -> %d, %d -> %d\n", decoder->audio_sample_count, needed_audio_samples, decoder->video_frame_count, needed_video_frames); //fprintf(stderr, "ensure %d -> %d, %d -> %d\n", settings->audio_sample_count, needed_audio_samples, settings->video_frame_count, needed_video_frames);
return true; return true;
} }
void retire_av_data(decoder_t *decoder, int retired_audio_samples, int retired_video_frames) { void retire_av_data(settings_t *settings, int retired_audio_samples, int retired_video_frames)
//fprintf(stderr, "retire %d -> %d, %d -> %d\n", decoder->audio_sample_count, retired_audio_samples, decoder->video_frame_count, retired_video_frames); {
assert(retired_audio_samples <= decoder->audio_sample_count); av_decoder_state_t* av = &(settings->decoder_state_av);
assert(retired_video_frames <= decoder->video_frame_count);
//fprintf(stderr, "retire %d -> %d, %d -> %d\n", settings->audio_sample_count, retired_audio_samples, settings->video_frame_count, retired_video_frames);
assert(retired_audio_samples <= settings->audio_sample_count);
assert(retired_video_frames <= settings->video_frame_count);
int sample_size = sizeof(int16_t); int sample_size = sizeof(int16_t);
int frame_size = decoder->state.video_frame_dst_size; if (settings->audio_sample_count > retired_audio_samples) {
memmove(settings->audio_samples, settings->audio_samples + retired_audio_samples, (settings->audio_sample_count - retired_audio_samples)*sample_size);
}
settings->audio_sample_count -= retired_audio_samples;
if (decoder->audio_sample_count > retired_audio_samples) int frame_size = av->video_frame_dst_size;
memmove( if (settings->video_frame_count > retired_video_frames) {
decoder->audio_samples, memmove(settings->video_frames, settings->video_frames + retired_video_frames*frame_size, (settings->video_frame_count - retired_video_frames)*frame_size);
decoder->audio_samples + retired_audio_samples, }
(decoder->audio_sample_count - retired_audio_samples) * sample_size settings->video_frame_count -= retired_video_frames;
);
if (decoder->video_frame_count > retired_video_frames)
memmove(
decoder->video_frames,
decoder->video_frames + retired_video_frames * frame_size,
(decoder->video_frame_count - retired_video_frames) * frame_size
);
decoder->audio_sample_count -= retired_audio_samples;
decoder->video_frame_count -= retired_video_frames;
} }
void close_av_data(decoder_t *decoder) { void close_av_data(settings_t *settings)
decoder_state_t *av = &(decoder->state); {
av_decoder_state_t* av = &(settings->decoder_state_av);
av_frame_free(&(av->frame)); av_frame_free(&(av->frame));
swr_free(&(av->resampler)); swr_free(&(av->resampler));
// Deprecated, kept for compatibility with older FFmpeg versions.
avcodec_close(av->audio_codec_context); avcodec_close(av->audio_codec_context);
avcodec_free_context(&(av->audio_codec_context)); avcodec_free_context(&(av->audio_codec_context));
avformat_free_context(av->format); avformat_free_context(av->format);
if(decoder->audio_samples != NULL) { if(settings->audio_samples != NULL) {
free(decoder->audio_samples); free(settings->audio_samples);
decoder->audio_samples = NULL; settings->audio_samples = NULL;
} }
if(decoder->video_frames != NULL) { if(settings->video_frames != NULL) {
free(decoder->video_frames); free(settings->video_frames);
decoder->video_frames = NULL; settings->video_frames = NULL;
} }
} }
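
Review note on the aspect-ratio hunk in `open_av_data` above: both sides of the diff shrink one of the requested dimensions to match the input's aspect ratio, then round the result up to a multiple of 16, presumably so the frame keeps whole 16x16 MDEC macroblocks. A minimal standalone sketch of that arithmetic follows. `fit_to_aspect` is a hypothetical helper that does not exist in psxavenc; only the two rounding expressions are taken from the diff.

```c
#include <assert.h>

/* Hypothetical helper (not part of psxavenc): shrink one dimension of the
 * requested output size so that it matches the source aspect ratio, then
 * round that dimension up to the next multiple of 16. */
static void fit_to_aspect(int src_w, int src_h, int *dst_w, int *dst_h) {
	assert(src_w > 0 && src_h > 0 && *dst_w > 0 && *dst_h > 0);

	double src_ratio = (double)src_w / (double)src_h;
	double dst_ratio = (double)*dst_w / (double)*dst_h;

	if (src_ratio < dst_ratio) {
		/* Source is narrower than the target: reduce the width. */
		*dst_w = (int)((double)*dst_h * src_ratio + 15.0) & ~15;
	} else {
		/* Source is wider than (or equal to) the target: reduce the height. */
		*dst_h = (int)((double)*dst_w / src_ratio + 15.0) & ~15;
	}
}
```

With a 1920x1080 input and a requested size of 320x240, the width stays at 320 and the height becomes 192. The cast to `int` binds tighter than `&`, so the mask is applied to the already truncated value.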


@ -1,80 +0,0 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#pragma once
#include <stdbool.h>
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavcodec/avdct.h>
#include <libavformat/avformat.h>
#include <libswresample/swresample.h>
#include <libswscale/swscale.h>
#include "args.h"
typedef struct {
int video_frame_dst_size;
int audio_stream_index;
int video_stream_index;
AVFormatContext* format;
AVStream* audio_stream;
AVStream* video_stream;
AVCodecContext* audio_codec_context;
AVCodecContext* video_codec_context;
struct SwrContext* resampler;
struct SwsContext* scaler;
AVFrame* frame;
int sample_count_mul;
double video_next_pts;
} decoder_state_t;
typedef struct {
int16_t *audio_samples;
int audio_sample_count;
uint8_t *video_frames;
int video_frame_count;
int video_width;
int video_height;
int video_fps_num;
int video_fps_den;
bool end_of_input;
decoder_state_t state;
} decoder_t;
enum {
DECODER_USE_AUDIO = 1 << 0,
DECODER_USE_VIDEO = 1 << 1,
DECODER_AUDIO_REQUIRED = 1 << 2,
DECODER_VIDEO_REQUIRED = 1 << 3
};
bool open_av_data(decoder_t *decoder, const args_t *args, int flags);
bool poll_av_data(decoder_t *decoder);
bool ensure_av_data(decoder_t *decoder, int needed_audio_samples, int needed_video_frames);
void retire_av_data(decoder_t *decoder, int retired_audio_samples, int retired_video_frames);
void close_av_data(decoder_t *decoder);
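
Review note on the header above: on the `main` side, `open_av_data` takes a single `flags` bitmask where the `v0.2.0` side takes four separate `bool` parameters. The sketch below shows how such a mask is tested, assuming the convention seen in the diff that a stream index of -1 means no stream was found. The enum values are copied from the header; `audio_requirement_met` is a hypothetical helper written for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	DECODER_USE_AUDIO      = 1 << 0,
	DECODER_USE_VIDEO      = 1 << 1,
	DECODER_AUDIO_REQUIRED = 1 << 2,
	DECODER_VIDEO_REQUIRED = 1 << 3
};

/* Hypothetical helper (not part of psxavenc): returns false only when audio
 * was requested, marked as required, and no audio stream was found. */
static bool audio_requirement_met(int flags, int audio_stream_index) {
	assert(audio_stream_index >= -1);

	if (!(flags & DECODER_USE_AUDIO))
		return true; /* Audio was not requested at all. */
	if ((flags & DECODER_AUDIO_REQUIRED) && audio_stream_index == -1)
		return false; /* Required, but the input has no audio stream. */

	return true;
}
```

A caller would combine the bits with `|`, for example `DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED`. This keeps the call site readable compared with a run of positional `true`/`false` arguments.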


@ -3,7 +3,7 @@ psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg Copyright (c) 2023 spicyjpeg
This software is provided 'as-is', without any express or implied This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages warranty. In no event will the authors be held liable for any damages
@ -22,87 +22,48 @@ freely, subject to the following restrictions:
3. This notice may not be removed or altered from any source distribution. 3. This notice may not be removed or altered from any source distribution.
*/ */
#include <assert.h> #include "common.h"
#include <stdint.h> #include "libpsxav.h"
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <libpsxav.h>
#include "args.h"
#include "decoding.h"
#include "mdec.h"
static time_t start_time = 0; static time_t get_elapsed_time(settings_t *settings) {
static time_t last_progress_update = 0; if (!settings->show_progress) {
static time_t get_elapsed_time(void) {
time_t t;
if (start_time > 0) {
t = time(NULL) - start_time;
} else {
t = 0;
start_time = time(NULL);
}
if (t <= last_progress_update)
return 0; return 0;
}
last_progress_update = t; time_t t = time(NULL) - settings->start_time;
if (t <= settings->last_progress_update) {
return 0;
}
settings->last_progress_update = t;
return t; return t;
} }
static psx_audio_xa_settings_t args_to_libpsxav_xa_audio(const args_t *args) { static psx_audio_xa_settings_t settings_to_libpsxav_xa_audio(settings_t *settings) {
psx_audio_xa_settings_t settings; psx_audio_xa_settings_t new_settings;
new_settings.bits_per_sample = settings->bits_per_sample;
new_settings.frequency = settings->frequency;
new_settings.stereo = settings->channels == 2;
new_settings.file_number = settings->file_number;
new_settings.channel_number = settings->channel_number;
settings.bits_per_sample = args->audio_bit_depth; switch (settings->format) {
settings.frequency = args->audio_frequency; case FORMAT_XA:
settings.stereo = (args->audio_channels == 2); case FORMAT_STR2:
settings.file_number = args->audio_xa_file; new_settings.format = PSX_AUDIO_XA_FORMAT_XA;
settings.channel_number = args->audio_xa_channel; break;
default:
new_settings.format = PSX_AUDIO_XA_FORMAT_XACD;
break;
}
if (args->format == FORMAT_XACD || args->format == FORMAT_STRCD) return new_settings;
settings.format = PSX_AUDIO_XA_FORMAT_XACD;
else
settings.format = PSX_AUDIO_XA_FORMAT_XA;
return settings;
}; };
static void init_sector_buffer_video(const args_t *args, uint8_t *sector, int lba) { void write_vag_header(int size_per_channel, uint8_t *header, settings_t *settings) {
psx_cdrom_sector_xa_subheader_t *subheader = NULL;
if (args->format == FORMAT_STRCD) {
psx_cdrom_init_sector((psx_cdrom_sector_t *)sector, lba, PSX_CDROM_SECTOR_TYPE_MODE2_FORM1);
subheader = ((psx_cdrom_sector_t *)sector)->mode2.subheader;
} else if (args->format == FORMAT_STR) {
subheader = (psx_cdrom_sector_xa_subheader_t *)sector;
}
if (subheader != NULL) {
subheader->file = args->audio_xa_file;
subheader->channel = args->audio_xa_channel & PSX_CDROM_SECTOR_XA_CHANNEL_MASK;
subheader->submode = PSX_CDROM_SECTOR_XA_SUBMODE_DATA | PSX_CDROM_SECTOR_XA_SUBMODE_RT;
subheader->coding = 0;
memcpy(subheader + 1, subheader, sizeof(psx_cdrom_sector_xa_subheader_t));
}
}
#define VAG_HEADER_SIZE 0x30
static void write_vag_header(const args_t *args, int size_per_channel, uint8_t *header) {
memset(header, 0, VAG_HEADER_SIZE);
// Magic // Magic
header[0x00] = 'V'; header[0x00] = 'V';
header[0x01] = 'A'; header[0x01] = 'A';
header[0x02] = 'G'; header[0x02] = 'G';
header[0x03] = settings->interleave ? 'i' : 'p';
if (args->format == FORMAT_VAGI)
header[0x03] = 'i';
else
header[0x03] = 'p';
// Version (big-endian) // Version (big-endian)
header[0x04] = 0x00; header[0x04] = 0x00;
@ -111,533 +72,310 @@ static void write_vag_header(const args_t *args, int size_per_channel, uint8_t *
header[0x07] = 0x20; header[0x07] = 0x20;
// Interleave (little-endian) // Interleave (little-endian)
if (args->format == FORMAT_VAGI) { header[0x08] = (uint8_t)settings->interleave;
header[0x08] = (uint8_t)args->audio_interleave; header[0x09] = (uint8_t)(settings->interleave>>8);
header[0x09] = (uint8_t)(args->audio_interleave >> 8); header[0x0a] = (uint8_t)(settings->interleave>>16);
header[0x0A] = (uint8_t)(args->audio_interleave >> 16); header[0x0b] = (uint8_t)(settings->interleave>>24);
header[0x0B] = (uint8_t)(args->audio_interleave >> 24);
}
// Length of data for each channel (big-endian) // Length of data for each channel (big-endian)
header[0x0C] = (uint8_t)(size_per_channel >> 24); header[0x0c] = (uint8_t)(size_per_channel>>24);
header[0x0D] = (uint8_t)(size_per_channel >> 16); header[0x0d] = (uint8_t)(size_per_channel>>16);
header[0x0E] = (uint8_t)(size_per_channel >> 8); header[0x0e] = (uint8_t)(size_per_channel>>8);
header[0x0F] = (uint8_t)size_per_channel; header[0x0f] = (uint8_t)size_per_channel;
// Sample rate (big-endian) // Sample rate (big-endian)
header[0x10] = (uint8_t)(args->audio_frequency >> 24); header[0x10] = (uint8_t)(settings->frequency>>24);
header[0x11] = (uint8_t)(args->audio_frequency >> 16); header[0x11] = (uint8_t)(settings->frequency>>16);
header[0x12] = (uint8_t)(args->audio_frequency >> 8); header[0x12] = (uint8_t)(settings->frequency>>8);
header[0x13] = (uint8_t)args->audio_frequency; header[0x13] = (uint8_t)settings->frequency;
// Number of channels (little-endian) // Number of channels (little-endian)
header[0x1E] = (uint8_t)args->audio_channels; header[0x1e] = (uint8_t)settings->channels;
header[0x1F] = 0x00; header[0x1f] = 0x00;
// Filename // Filename
int name_offset = strlen(args->output_file); //strncpy(header + 0x20, "psxavenc", 16);
while ( memset(header + 0x20, 0, 16);
name_offset > 0 &&
args->output_file[name_offset - 1] != '/' &&
args->output_file[name_offset - 1] != '\\'
)
name_offset--;
strncpy((char*)(header + 0x20), &args->output_file[name_offset], 16);
} }
// The functions below are some peak spaghetti code I would rewrite if that void encode_file_spu(settings_t *settings, FILE *output) {
// didn't also require scrapping the rest of the codebase. -- spicyjpeg psx_audio_encoder_channel_state_t audio_state;
int audio_samples_per_block = psx_audio_spu_get_samples_per_block();
int block_size = psx_audio_spu_get_buffer_size_per_block();
uint8_t buffer[16];
int block_count;
void encode_file_xa(const args_t *args, decoder_t *decoder, FILE *output) {
psx_audio_xa_settings_t xa_settings = args_to_libpsxav_xa_audio(args);
int audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
psx_audio_encoder_state_t audio_state;
memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));
int sector_count = 0;
for (; ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, 0); sector_count++) {
int samples_length = decoder->audio_sample_count / args->audio_channels;
if (samples_length > audio_samples_per_sector)
samples_length = audio_samples_per_sector;
uint8_t sector[PSX_CDROM_SECTOR_SIZE];
int length = psx_audio_xa_encode(
xa_settings,
&audio_state,
decoder->audio_samples,
samples_length,
sector_count,
sector
);
if (decoder->end_of_input)
psx_audio_xa_encode_finalize(xa_settings, sector, length);
retire_av_data(decoder, samples_length * args->audio_channels, 0);
fwrite(sector, length, 1, output);
time_t t = get_elapsed_time();
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
fprintf(
stderr,
"\rLBA: %6d | Encoding speed: %5.2fx",
sector_count,
(double)(sector_count * audio_samples_per_sector) / (double)(args->audio_frequency * t)
);
}
}
}
void encode_file_spu(const args_t *args, decoder_t *decoder, FILE *output) {
psx_audio_encoder_channel_state_t audio_state;
memset(&audio_state, 0, sizeof(psx_audio_encoder_channel_state_t)); memset(&audio_state, 0, sizeof(psx_audio_encoder_channel_state_t));
// The header must be written after the data as we don't yet know the // The header must be written after the data as we don't yet know the
// number of audio samples. // number of audio samples.
if (args->format == FORMAT_VAG) if (settings->format == FORMAT_VAG) {
fseek(output, VAG_HEADER_SIZE, SEEK_SET); fseek(output, 48, SEEK_SET);
uint8_t block[PSX_AUDIO_SPU_BLOCK_SIZE];
int block_count = 0;
if (!(args->flags & FLAG_SPU_NO_LEADING_DUMMY)) {
// Insert leading silent block
memset(block, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
fwrite(block, PSX_AUDIO_SPU_BLOCK_SIZE, 1, output);
block_count++;
} }
int loop_start_block = -1; for (block_count = 0; ensure_av_data(settings, audio_samples_per_block, 0); block_count++) {
int samples_length = settings->audio_sample_count;
if (samples_length > audio_samples_per_block) samples_length = audio_samples_per_block;
if (args->audio_loop_point >= 0) int length = psx_audio_spu_encode(&audio_state, settings->audio_samples, samples_length, 1, buffer);
loop_start_block = block_count + (args->audio_loop_point * args->audio_frequency) / (PSX_AUDIO_SPU_SAMPLES_PER_BLOCK * 1000); if (!block_count) {
// This flag is not required as the SPU already resets the loop
// address when starting playback of a sample.
//buffer[1] |= PSX_AUDIO_SPU_LOOP_START;
}
if (settings->end_of_input) {
buffer[1] |= settings->loop ? PSX_AUDIO_SPU_LOOP_REPEAT : PSX_AUDIO_SPU_LOOP_END;
}
for (; ensure_av_data(decoder, PSX_AUDIO_SPU_SAMPLES_PER_BLOCK, 0); block_count++) { retire_av_data(settings, samples_length, 0);
int samples_length = decoder->audio_sample_count; fwrite(buffer, length, 1, output);
if (samples_length > PSX_AUDIO_SPU_SAMPLES_PER_BLOCK) time_t t = get_elapsed_time(settings);
samples_length = PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; if (t) {
fprintf(stderr, "\rBlock: %6d | Encoding speed: %5.2fx",
int length = psx_audio_spu_encode(
&audio_state,
decoder->audio_samples,
samples_length,
1,
block
);
if (block_count == loop_start_block)
block[1] |= PSX_AUDIO_SPU_LOOP_START;
if ((args->flags & FLAG_SPU_LOOP_END) && decoder->end_of_input)
block[1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
retire_av_data(decoder, samples_length, 0);
fwrite(block, length, 1, output);
time_t t = get_elapsed_time();
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
fprintf(
stderr,
"\rBlock: %6d | Encoding speed: %5.2fx",
block_count, block_count,
(double)(block_count * PSX_AUDIO_SPU_SAMPLES_PER_BLOCK) / (double)(args->audio_frequency * t) (double)(block_count*audio_samples_per_block) / (double)(settings->frequency*t)
); );
} }
} }
if (!(args->flags & FLAG_SPU_LOOP_END)) { if (settings->format == FORMAT_VAG) {
// Insert trailing looping block uint8_t header[48];
memset(block, 0, PSX_AUDIO_SPU_BLOCK_SIZE); memset(header, 0, 48);
block[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END; write_vag_header(block_count*block_size, header, settings);
fwrite(block, PSX_AUDIO_SPU_BLOCK_SIZE, 1, output);
block_count++;
}
int overflow = (block_count * PSX_AUDIO_SPU_BLOCK_SIZE) % args->alignment;
if (overflow) {
for (int i = 0; i < (args->alignment - overflow); i++)
fputc(0, output);
}
if (args->format == FORMAT_VAG) {
uint8_t header[VAG_HEADER_SIZE];
write_vag_header(args, block_count * PSX_AUDIO_SPU_BLOCK_SIZE, header);
fseek(output, 0, SEEK_SET); fseek(output, 0, SEEK_SET);
fwrite(header, VAG_HEADER_SIZE, 1, output); fwrite(header, 48, 1, output);
} }
} }
void encode_file_spui(const args_t *args, decoder_t *decoder, FILE *output) { void encode_file_spu_interleaved(settings_t *settings, FILE *output) {
int audio_samples_per_chunk = args->audio_interleave / PSX_AUDIO_SPU_BLOCK_SIZE * PSX_AUDIO_SPU_SAMPLES_PER_BLOCK; int audio_state_size = sizeof(psx_audio_encoder_channel_state_t) * settings->channels;
// NOTE: since the interleaved .vag format is not standardized, some tools // NOTE: since the interleaved .vag format is not standardized, some tools
// (such as vgmstream) will not properly play files with interleave < 2048, // (such as vgmstream) will not properly play files with interleave < 2048,
// alignment != 2048 or channels != 2. // alignment != 2048 or channels != 2.
int chunk_size = args->audio_interleave * args->audio_channels + args->alignment - 1; int buffer_size = settings->interleave + settings->alignment - 1;
chunk_size -= chunk_size % args->alignment; buffer_size -= buffer_size % settings->alignment;
int header_size = 48 + settings->alignment - 1;
header_size -= header_size % settings->alignment;
int header_size = VAG_HEADER_SIZE + args->alignment - 1;
header_size -= header_size % args->alignment;
if (args->format == FORMAT_VAGI)
fseek(output, header_size, SEEK_SET);
int audio_state_size = sizeof(psx_audio_encoder_channel_state_t) * args->audio_channels;
psx_audio_encoder_channel_state_t *audio_state = malloc(audio_state_size); psx_audio_encoder_channel_state_t *audio_state = malloc(audio_state_size);
uint8_t *buffer = malloc(buffer_size);
int audio_samples_per_block = psx_audio_spu_get_samples_per_block();
int block_size = psx_audio_spu_get_buffer_size_per_block();
int audio_samples_per_chunk = settings->interleave / block_size * audio_samples_per_block;
int chunk_count;
memset(audio_state, 0, audio_state_size); memset(audio_state, 0, audio_state_size);
uint8_t *chunk = malloc(chunk_size); if (settings->format == FORMAT_VAGI) {
int chunk_count = 0; fseek(output, header_size, SEEK_SET);
}
for (; ensure_av_data(decoder, audio_samples_per_chunk * args->audio_channels, 0); chunk_count++) { for (chunk_count = 0; ensure_av_data(settings, audio_samples_per_chunk*settings->channels, 0); chunk_count++) {
int samples_length = decoder->audio_sample_count / args->audio_channels; int samples_length = settings->audio_sample_count / settings->channels;
if (samples_length > audio_samples_per_chunk) samples_length = audio_samples_per_chunk;
if (samples_length > audio_samples_per_chunk) for (int ch = 0; ch < settings->channels; ch++) {
samples_length = audio_samples_per_chunk; memset(buffer, 0, buffer_size);
int length = psx_audio_spu_encode(audio_state + ch, settings->audio_samples + ch, samples_length, settings->channels, buffer);
memset(chunk, 0, chunk_size); if (length) {
uint8_t *chunk_ptr = chunk; //buffer[1] |= PSX_AUDIO_SPU_LOOP_START;
if (settings->loop) {
// Insert leading silent block buffer[length - block_size + 1] |= PSX_AUDIO_SPU_LOOP_REPEAT;
if (chunk_count == 0 && !(args->flags & FLAG_SPU_NO_LEADING_DUMMY)) {
chunk_ptr += PSX_AUDIO_SPU_BLOCK_SIZE;
samples_length -= PSX_AUDIO_SPU_SAMPLES_PER_BLOCK;
}
for (int ch = 0; ch < args->audio_channels; ch++, chunk_ptr += args->audio_interleave) {
int length = psx_audio_spu_encode(
audio_state + ch,
decoder->audio_samples + ch,
samples_length,
args->audio_channels,
chunk_ptr
);
if (length > 0) {
uint8_t *last_block = chunk_ptr + length - PSX_AUDIO_SPU_BLOCK_SIZE;
if (args->flags & FLAG_SPU_LOOP_END) {
last_block[1] = PSX_AUDIO_SPU_LOOP_REPEAT;
} else if (decoder->end_of_input) {
// HACK: the trailing block should in theory be appended to
// the existing data, but it's easier to just zerofill and
// repurpose the last encoded block
memset(last_block, 0, PSX_AUDIO_SPU_BLOCK_SIZE);
last_block[1] = PSX_AUDIO_SPU_LOOP_START | PSX_AUDIO_SPU_LOOP_END;
} }
if (settings->end_of_input) {
buffer[length - block_size + 1] |= PSX_AUDIO_SPU_LOOP_END;
}
}
fwrite(buffer, buffer_size, 1, output);
time_t t = get_elapsed_time(settings);
if (t) {
fprintf(stderr, "\rChunk: %6d | Encoding speed: %5.2fx",
chunk_count,
(double)(chunk_count*audio_samples_per_chunk) / (double)(settings->frequency*t)
);
} }
} }
retire_av_data(decoder, samples_length * args->audio_channels, 0); retire_av_data(settings, samples_length*settings->channels, 0);
fwrite(chunk, chunk_size, 1, output);
time_t t = get_elapsed_time();
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
fprintf(
stderr,
"\rChunk: %6d | Encoding speed: %5.2fx",
chunk_count,
(double)(chunk_count * audio_samples_per_chunk) / (double)(args->audio_frequency * t)
);
}
} }
free(audio_state); if (settings->format == FORMAT_VAGI) {
free(chunk);
if (args->format == FORMAT_VAGI) {
uint8_t *header = malloc(header_size); uint8_t *header = malloc(header_size);
memset(header, 0, header_size); memset(header, 0, header_size);
write_vag_header(args, chunk_count * args->audio_interleave, header); write_vag_header(chunk_count*settings->interleave, header, settings);
fseek(output, 0, SEEK_SET); fseek(output, 0, SEEK_SET);
fwrite(header, header_size, 1, output); fwrite(header, header_size, 1, output);
free(header); free(header);
} }
free(audio_state);
free(buffer);
} }
void encode_file_str(const args_t *args, decoder_t *decoder, FILE *output) { void encode_file_xa(settings_t *settings, FILE *output) {
psx_audio_xa_settings_t xa_settings = args_to_libpsxav_xa_audio(args); psx_audio_xa_settings_t xa_settings = settings_to_libpsxav_xa_audio(settings);
int sector_size = psx_audio_xa_get_buffer_size_per_sector(xa_settings); psx_audio_encoder_state_t audio_state;
int audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
uint8_t buffer[2352];
int interleave;
int audio_samples_per_sector;
int video_sectors_per_block;
if (decoder->state.audio_stream != NULL) {
// 1/N audio, (N-1)/N video
interleave = psx_audio_xa_get_sector_interleave(xa_settings) * args->str_cd_speed;
audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
video_sectors_per_block = interleave - 1;
if (!(args->flags & FLAG_QUIET))
fprintf(
stderr,
"Interleave: %d/%d audio, %d/%d video\n",
interleave - video_sectors_per_block,
interleave,
video_sectors_per_block,
interleave
);
} else {
// 0/1 audio, 1/1 video
interleave = 1;
audio_samples_per_sector = 0;
video_sectors_per_block = 1;
}
psx_audio_encoder_state_t audio_state;
memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t)); memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));
mdec_encoder_t encoder; for (int j = 0; ensure_av_data(settings, audio_samples_per_sector*settings->channels, 0); j++) {
init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height); int samples_length = settings->audio_sample_count / settings->channels;
if (samples_length > audio_samples_per_sector) samples_length = audio_samples_per_sector;
// e.g. 15fps = (150*7/8/15) = 8.75 blocks per frame int length = psx_audio_xa_encode(xa_settings, &audio_state, settings->audio_samples, samples_length, buffer);
encoder.state.frame_block_base_overflow = (75 * args->str_cd_speed) * video_sectors_per_block * args->str_fps_den; if (settings->end_of_input) {
encoder.state.frame_block_overflow_den = interleave * args->str_fps_num; psx_audio_xa_encode_finalize(xa_settings, buffer, length);
double frame_size = (double)encoder.state.frame_block_base_overflow / (double)encoder.state.frame_block_overflow_den;
if (!(args->flags & FLAG_QUIET))
fprintf(stderr, "Frame size: %.2f sectors\n", frame_size);
encoder.state.frame_output = malloc(2016 * (int)ceil(frame_size));
encoder.state.frame_index = 0;
encoder.state.frame_data_offset = 0;
encoder.state.frame_max_size = 0;
encoder.state.frame_block_overflow_num = 0;
encoder.state.quant_scale_sum = 0;
// FIXME: this needs an extra frame to prevent A/V desync
int frames_needed = (int) ceil((double)video_sectors_per_block / frame_size);
if (frames_needed < 2)
frames_needed = 2;
int sector_count = 0;
for (; !decoder->end_of_input || encoder.state.frame_data_offset < encoder.state.frame_max_size; sector_count++) {
ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, frames_needed);
uint8_t sector[PSX_CDROM_SECTOR_SIZE];
bool is_video_sector;
if (audio_samples_per_sector == 0)
is_video_sector = true;
else if (args->flags & FLAG_STR_TRAILING_AUDIO)
is_video_sector = (sector_count % interleave) < video_sectors_per_block;
else
is_video_sector = (sector_count % interleave) > 0;
if (is_video_sector) {
init_sector_buffer_video(args, sector, sector_count);
int frames_used = encode_sector_str(
&encoder,
args->format,
args->str_video_id,
decoder->video_frames,
sector
);
psx_cdrom_calculate_checksums((psx_cdrom_sector_t *)sector, PSX_CDROM_SECTOR_TYPE_MODE2_FORM1);
retire_av_data(decoder, 0, frames_used);
} else {
int samples_length = decoder->audio_sample_count / args->audio_channels;
if (samples_length > audio_samples_per_sector)
samples_length = audio_samples_per_sector;
// FIXME: this is an extremely hacky way to handle audio tracks
// shorter than the video track
if (!samples_length)
video_sectors_per_block++;
int length = psx_audio_xa_encode(
xa_settings,
&audio_state,
decoder->audio_samples,
samples_length,
sector_count,
sector
);
if (decoder->end_of_input)
psx_audio_xa_encode_finalize(xa_settings, sector, length);
retire_av_data(decoder, samples_length * args->audio_channels, 0);
} }
fwrite(sector, sector_size, 1, output); if (settings->format == FORMAT_XACD) {
int t = j + 75*2;
time_t t = get_elapsed_time(); // Put the time in
buffer[0x00C] = ((t/75/60)%10)|(((t/75/60)/10)<<4);
buffer[0x00D] = (((t/75)%60)%10)|((((t/75)%60)/10)<<4);
buffer[0x00E] = ((t%75)%10)|(((t%75)/10)<<4);
}
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) { retire_av_data(settings, samples_length*settings->channels, 0);
fprintf( fwrite(buffer, length, 1, output);
stderr,
"\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx", time_t t = get_elapsed_time(settings);
encoder.state.frame_index, if (t) {
sector_count, fprintf(stderr, "\rLBA: %6d | Encoding speed: %5.2fx",
(double)encoder.state.quant_scale_sum / (double)encoder.state.frame_index, j,
(double)(encoder.state.frame_index * args->str_fps_den) / (double)(t * args->str_fps_num) (double)(j*audio_samples_per_sector) / (double)(settings->frequency*t)
); );
} }
} }
free(encoder.state.frame_output);
destroy_mdec_encoder(&encoder);
} }
void encode_file_strspu(const args_t *args, decoder_t *decoder, FILE *output) { void encode_file_str(settings_t *settings, FILE *output) {
int interleave; psx_audio_xa_settings_t xa_settings = settings_to_libpsxav_xa_audio(settings);
psx_audio_encoder_state_t audio_state;
int audio_samples_per_sector; int audio_samples_per_sector;
uint8_t buffer[2352];
int interleave;
int video_sectors_per_block; int video_sectors_per_block;
if (settings->decoder_state_av.audio_stream) {
if (decoder->state.audio_stream != NULL) { // 1/N audio, (N-1)/N video
assert(false); // TODO: implement audio_samples_per_sector = psx_audio_xa_get_samples_per_sector(xa_settings);
interleave = psx_audio_xa_get_sector_interleave(xa_settings) * settings->cd_speed;
if (!(args->flags & FLAG_QUIET)) video_sectors_per_block = interleave - 1;
fprintf(
stderr,
"Interleave: %d/%d audio, %d/%d video\n",
interleave - video_sectors_per_block,
interleave,
video_sectors_per_block,
interleave
);
} else { } else {
// 0/1 audio, 1/1 video // 0/1 audio, 1/1 video
interleave = 1;
audio_samples_per_sector = 0; audio_samples_per_sector = 0;
interleave = 1;
video_sectors_per_block = 1; video_sectors_per_block = 1;
} }
mdec_encoder_t encoder; if (!settings->quiet) {
init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height); fprintf(stderr, "Interleave: %d/%d audio, %d/%d video\n",
interleave - video_sectors_per_block, interleave, video_sectors_per_block, interleave);
}
memset(&audio_state, 0, sizeof(psx_audio_encoder_state_t));
// e.g. 15fps = (150*7/8/15) = 8.75 blocks per frame // e.g. 15fps = (150*7/8/15) = 8.75 blocks per frame
encoder.state.frame_block_base_overflow = (75 * args->str_cd_speed) * video_sectors_per_block * args->str_fps_den; settings->state_vid.frame_block_base_overflow = (75*settings->cd_speed) * video_sectors_per_block * settings->video_fps_den;
encoder.state.frame_block_overflow_den = interleave * args->str_fps_num; settings->state_vid.frame_block_overflow_den = interleave * settings->video_fps_num;
double frame_size = (double)encoder.state.frame_block_base_overflow / (double)encoder.state.frame_block_overflow_den; double frame_size = (double)settings->state_vid.frame_block_base_overflow / (double)settings->state_vid.frame_block_overflow_den;
if (!settings->quiet) {
if (!(args->flags & FLAG_QUIET))
fprintf(stderr, "Frame size: %.2f sectors\n", frame_size); fprintf(stderr, "Frame size: %.2f sectors\n", frame_size);
}
encoder.state.frame_output = malloc(2016 * (int)ceil(frame_size)); settings->state_vid.frame_output = malloc(2016 * (int)ceil(frame_size));
encoder.state.frame_index = 0; settings->state_vid.frame_index = 0;
encoder.state.frame_data_offset = 0; settings->state_vid.frame_data_offset = 0;
encoder.state.frame_max_size = 0; settings->state_vid.frame_max_size = 0;
encoder.state.frame_block_overflow_num = 0; settings->state_vid.frame_block_overflow_num = 0;
encoder.state.quant_scale_sum = 0; settings->state_vid.quant_scale_sum = 0;
// FIXME: this needs an extra frame to prevent A/V desync // FIXME: this needs an extra frame to prevent A/V desync
int frames_needed = (int) ceil((double)video_sectors_per_block / frame_size); int frames_needed = (int) ceil((double)video_sectors_per_block / frame_size);
if (frames_needed < 2) frames_needed = 2;
if (frames_needed < 2) for (int j = 0; !settings->end_of_input || settings->state_vid.frame_data_offset < settings->state_vid.frame_max_size; j++) {
frames_needed = 2; ensure_av_data(settings, audio_samples_per_sector*settings->channels, frames_needed);
int sector_count = 0; if ((j%interleave) < video_sectors_per_block) {
// Video sector
for (; !decoder->end_of_input || encoder.state.frame_data_offset < encoder.state.frame_max_size; sector_count++) { init_sector_buffer_video(buffer, settings);
ensure_av_data(decoder, audio_samples_per_sector * args->audio_channels, frames_needed); encode_sector_str(settings->video_frames, buffer, settings);
uint8_t sector[2048];
bool is_video_sector;
if (audio_samples_per_sector == 0)
is_video_sector = true;
else if (args->flags & FLAG_STR_TRAILING_AUDIO)
is_video_sector = (sector_count % interleave) < video_sectors_per_block;
else
is_video_sector = (sector_count % interleave) > 0;
if (is_video_sector) {
init_sector_buffer_video(args, sector, sector_count);
int frames_used = encode_sector_str(
&encoder,
args->format,
args->str_video_id,
decoder->video_frames,
sector
);
retire_av_data(decoder, 0, frames_used);
} else { } else {
int samples_length = decoder->audio_sample_count / args->audio_channels; // Audio sector
int samples_length = settings->audio_sample_count / settings->channels;
if (samples_length > audio_samples_per_sector) if (samples_length > audio_samples_per_sector) samples_length = audio_samples_per_sector;
samples_length = audio_samples_per_sector;
// FIXME: this is an extremely hacky way to handle audio tracks // FIXME: this is an extremely hacky way to handle audio tracks
// shorter than the video track // shorter than the video track
if (!samples_length) if (!samples_length) {
video_sectors_per_block++; video_sectors_per_block++;
}
assert(false); // TODO: implement int length = psx_audio_xa_encode(xa_settings, &audio_state, settings->audio_samples, samples_length, buffer);
if (settings->end_of_input) {
retire_av_data(decoder, samples_length * args->audio_channels, 0); psx_audio_xa_encode_finalize(xa_settings, buffer, length);
}
retire_av_data(settings, samples_length*settings->channels, 0);
} }
fwrite(sector, 2048, 1, output); if (settings->format == FORMAT_STR2CD) {
int t = j + 75*2;
time_t t = get_elapsed_time(); // Put the time in
buffer[0x00C] = ((t/75/60)%10)|(((t/75/60)/10)<<4);
buffer[0x00D] = (((t/75)%60)%10)|((((t/75)%60)/10)<<4);
buffer[0x00E] = ((t%75)%10)|(((t%75)/10)<<4);
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) { // FIXME: EDC is not calculated in 2336-byte sector mode (shouldn't
fprintf( // matter anyway, any CD image builder will have to recalculate it
stderr, // due to the sector's MSF changing)
"\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx", if((j%interleave) < video_sectors_per_block) {
encoder.state.frame_index, calculate_edc_data(buffer);
sector_count, }
(double)encoder.state.quant_scale_sum / (double)encoder.state.frame_index,
(double)(encoder.state.frame_index * args->str_fps_den) / (double)(t * args->str_fps_num)
);
} }
}
free(encoder.state.frame_output); fwrite(buffer, 2352, 1, output);
destroy_mdec_encoder(&encoder);
}
void encode_file_sbs(const args_t *args, decoder_t *decoder, FILE *output) { time_t t = get_elapsed_time(settings);
mdec_encoder_t encoder; if (t) {
init_mdec_encoder(&encoder, args->video_codec, args->video_width, args->video_height); fprintf(stderr, "\rFrame: %4d | LBA: %6d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
settings->state_vid.frame_index,
encoder.state.frame_output = malloc(args->alignment);
encoder.state.frame_data_offset = 0;
encoder.state.frame_max_size = args->alignment;
encoder.state.quant_scale_sum = 0;
for (int j = 0; ensure_av_data(decoder, 0, 1); j++) {
encode_frame_bs(&encoder, decoder->video_frames);
retire_av_data(decoder, 0, 1);
fwrite(encoder.state.frame_output, args->alignment, 1, output);
time_t t = get_elapsed_time();
if (!(args->flags & FLAG_HIDE_PROGRESS) && t) {
fprintf(
stderr,
"\rFrame: %4d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
j, j,
(double)encoder.state.quant_scale_sum / (double)j, (double)settings->state_vid.quant_scale_sum / (double)settings->state_vid.frame_index,
(double)(j * args->str_fps_den) / (double)(t * args->str_fps_num) (double)(settings->state_vid.frame_index*settings->video_fps_den) / (double)(t*settings->video_fps_num)
); );
} }
} }
free(encoder.state.frame_output); free(settings->state_vid.frame_output);
destroy_mdec_encoder(&encoder); }
void encode_file_sbs(settings_t *settings, FILE *output) {
settings->state_vid.frame_output = malloc(settings->alignment);
settings->state_vid.frame_data_offset = 0;
settings->state_vid.frame_max_size = settings->alignment;
settings->state_vid.quant_scale_sum = 0;
for (int j = 0; ensure_av_data(settings, 0, 2); j++) {
encode_frame_bs(settings->video_frames, settings);
fwrite(settings->state_vid.frame_output, settings->alignment, 1, output);
time_t t = get_elapsed_time(settings);
if (t) {
fprintf(stderr, "\rFrame: %4d | Avg. q. scale: %5.2f | Encoding speed: %5.2fx",
j,
(double)settings->state_vid.quant_scale_sum / (double)j,
(double)(j*settings->video_fps_den) / (double)(t*settings->video_fps_num)
);
}
}
free(settings->state_vid.frame_output);
} }
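
Note on the sector timestamps in the xacd/str2cd paths above: the v0.2.0 side writes bytes 0x00C to 0x00E of each 2352-byte sector as packed BCD minute/second/frame values, computed from the sector index plus a 150-sector (2-second) offset. The following is a minimal sketch of that arithmetic, not code from either branch; the helper names to_bcd and sector_index_to_msf are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: pack a value in the range 0-99 as BCD (42 -> 0x42). */
static uint8_t to_bcd(int value) {
	return (uint8_t)(((value / 10) << 4) | (value % 10));
}

/*
 * Hypothetical helper: compute the minute/second/frame address for a
 * 0-based sector index. Assumes 75 sectors per second and the same
 * 2-second (75*2) offset used by the expressions in the diff.
 */
static void sector_index_to_msf(int sector_index, uint8_t msf[3]) {
	int t = sector_index + 75 * 2;

	msf[0] = to_bcd(t / 75 / 60);   /* minutes */
	msf[1] = to_bcd((t / 75) % 60); /* seconds */
	msf[2] = to_bcd(t % 75);        /* frames  */
}
```

With these helpers, the three assignments to buffer[0x00C], buffer[0x00D] and buffer[0x00E] would reduce to a single call such as sector_index_to_msf(j, &buffer[0x00C]).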

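
Note on the alignment arithmetic in the SPU encoders above: both sides of the diff round sizes up to a multiple of the alignment with the "add alignment - 1, then subtract the remainder" idiom, and the main side of encode_file_spu additionally pads the output with zero bytes up to the next boundary. A small sketch of both calculations follows; align_up and padding_needed are hypothetical names, and a non-zero alignment is assumed.

```c
#include <stddef.h>

/*
 * Hypothetical helper: round size up to the next multiple of alignment,
 * mirroring the header_size and chunk_size computations in the diff.
 * Assumes alignment is greater than zero.
 */
size_t align_up(size_t size, size_t alignment) {
	size_t padded = size + alignment - 1;

	return padded - (padded % alignment);
}

/*
 * Hypothetical helper: number of zero bytes needed to bring size up to
 * the next multiple of alignment (0 if it is already aligned).
 */
size_t padding_needed(size_t size, size_t alignment) {
	size_t overflow = size % alignment;

	return overflow ? (alignment - overflow) : 0;
}
```

For a 48-byte VAG header and an alignment of 2048, align_up returns 2048 and padding_needed returns 2000.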

@ -1,201 +0,0 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#include <stdint.h>
#include <stdio.h>
#include "args.h"
#include "decoding.h"
#include "filefmt.h"
static const char *const bs_codec_names[NUM_BS_CODECS] = {
"BS v2",
"BS v3",
"BS v3 (with DC wrapping)"
};
static const uint8_t decoder_flags[NUM_FORMATS] = {
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // xa
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // xacd
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // spu
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // vag
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // spui
DECODER_USE_AUDIO | DECODER_AUDIO_REQUIRED, // vagi
DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // str
DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strcd
DECODER_USE_AUDIO | DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strspu
DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED, // strv
DECODER_USE_VIDEO | DECODER_VIDEO_REQUIRED // sbs
};
int main(int argc, const char **argv) {
args_t args;
decoder_t decoder;
FILE *output;
args.flags = 0;
args.format = FORMAT_INVALID;
args.input_file = NULL;
args.output_file = NULL;
args.swresample_options = NULL;
args.swscale_options = NULL;
if (!parse_args(&args, argv + 1, argc - 1))
return 1;
if (!open_av_data(&decoder, &args, decoder_flags[args.format])) {
fprintf(stderr, "Failed to open input file: %s\n", args.input_file);
return 1;
}
output = fopen(args.output_file, "wb");
if (output == NULL) {
fprintf(stderr, "Failed to open output file: %s\n", args.output_file);
return 1;
}
switch (args.format) {
case FORMAT_XA:
case FORMAT_XACD:
if (!(args.flags & FLAG_QUIET))
fprintf(
stderr,
"Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
args.audio_frequency,
args.audio_bit_depth,
(args.audio_channels == 2) ? "stereo" : "mono",
args.audio_xa_file,
args.audio_xa_channel
);
encode_file_xa(&args, &decoder, output);
break;
case FORMAT_SPU:
case FORMAT_VAG:
if (!(args.flags & FLAG_QUIET))
fprintf(
stderr,
"Audio format: SPU-ADPCM, %d Hz mono\n",
args.audio_frequency
);
encode_file_spu(&args, &decoder, output);
break;
case FORMAT_SPUI:
case FORMAT_VAGI:
if (!(args.flags & FLAG_QUIET))
fprintf(
stderr,
"Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
args.audio_frequency,
args.audio_channels,
args.audio_interleave
);
encode_file_spui(&args, &decoder, output);
break;
case FORMAT_STR:
case FORMAT_STRCD:
if (!(args.flags & FLAG_QUIET)) {
if (decoder.state.audio_stream)
fprintf(
stderr,
"Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
args.audio_frequency,
args.audio_bit_depth,
(args.audio_channels == 2) ? "stereo" : "mono",
args.audio_xa_file,
args.audio_xa_channel
);
fprintf(
stderr,
"Video format: %s, %dx%d, %.2f fps\n",
bs_codec_names[args.video_codec],
args.video_width,
args.video_height,
(double)args.str_fps_num / (double)args.str_fps_den
);
}
encode_file_str(&args, &decoder, output);
break;
case FORMAT_STRSPU:
// TODO: implement and remove this check
fprintf(stderr, "This format is not currently supported\n");
break;
case FORMAT_STRV:
if (!(args.flags & FLAG_QUIET)) {
if (decoder.state.audio_stream)
fprintf(
stderr,
"Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
args.audio_frequency,
args.audio_channels,
args.audio_interleave
);
fprintf(
stderr,
"Video format: %s, %dx%d, %.2f fps\n",
bs_codec_names[args.video_codec],
args.video_width,
args.video_height,
(double)args.str_fps_num / (double)args.str_fps_den
);
}
encode_file_strspu(&args, &decoder, output);
break;
case FORMAT_SBS:
if (!(args.flags & FLAG_QUIET))
fprintf(
stderr,
"Video format: %s, %dx%d, %.2f fps\n",
bs_codec_names[args.video_codec],
args.video_width,
args.video_height,
(double)args.str_fps_num / (double)args.str_fps_den
);
encode_file_sbs(&args, &decoder, output);
break;
default:
;
}
if (!(args.flags & FLAG_HIDE_PROGRESS))
fprintf(stderr, "\nDone.\n");
fclose(output);
close_av_data(&decoder);
return 0;
}

File diff suppressed because it is too large.


@ -1,74 +0,0 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023, 2025 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#pragma once
#include <stdbool.h>
#include <stdint.h>
#include <libavcodec/avdct.h>
#include "args.h"
typedef struct {
int frame_index;
int frame_data_offset;
int frame_max_size;
int frame_block_base_overflow;
int frame_block_overflow_num;
int frame_block_overflow_den;
int block_type;
int16_t last_dc_values[3];
uint16_t bits_value;
int bits_left;
uint8_t *frame_output;
int bytes_used;
int blocks_used;
int uncomp_hwords_used;
int quant_scale;
int quant_scale_sum;
AVDCT *dct_context;
uint32_t *ac_huffman_map;
uint32_t *dc_huffman_map;
int16_t *coeff_clamp_map;
int16_t *dct_block_lists[6];
} mdec_encoder_state_t;
typedef struct {
bs_codec_t video_codec;
int video_width;
int video_height;
mdec_encoder_state_t state;
} mdec_encoder_t;
bool init_mdec_encoder(mdec_encoder_t *encoder, bs_codec_t video_codec, int video_width, int video_height);
void destroy_mdec_encoder(mdec_encoder_t *encoder);
void encode_frame_bs(mdec_encoder_t *encoder, const uint8_t *video_frame);
int encode_sector_str(
mdec_encoder_t *encoder,
format_t format,
uint16_t str_video_id,
const uint8_t *video_frames,
uint8_t *output
);

417
psxavenc/psxavenc.c Normal file

@ -0,0 +1,417 @@
/*
psxavenc: MDEC video + SPU/XA-ADPCM audio encoder frontend
Copyright (c) 2019, 2020 Adrian "asie" Siekierka
Copyright (c) 2019 Ben "GreaseMonkey" Russell
Copyright (c) 2023 spicyjpeg
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
*/
#include "common.h"
const char *format_names[NUM_FORMATS] = {
"xa", "xacd",
"spu", "spui",
"vag", "vagi",
"str2", "str2cd",
"sbs2"
};
void print_help(void) {
fprintf(stderr,
"Usage:\n"
" psxavenc -t <xa|xacd> [-f 18900|37800] [-b 4|8] [-c 1|2] [-F 0-255] [-C 0-31] <in> <out.xa>\n"
" psxavenc -t <str2|str2cd> [-f 18900|37800] [-b 4|8] [-c 1|2] [-F 0-255] [-C 0-31] [-s WxH] [-I] [-r num/den] [-x 1|2] <in> <out.str>\n"
" psxavenc -t sbs2 [-s WxH] [-I] [-r num/den] [-a size] <in> <out.str>\n"
" psxavenc -t <spu|vag> [-f freq] [-L] <in> <out.vag>\n"
" psxavenc -t <spui|vagi> [-f freq] [-c 1-24] [-L] [-i size] [-a size] <in> <out.vag>\n"
"\nTool options:\n"
" -h Show this help message and exit\n"
" -q Suppress all non-error messages\n"
"\nOutput options:\n"
" -t format Use specified output type:\n"
" xa [A.] .xa, 2336-byte sectors\n"
" xacd [A.] .xa, 2352-byte sectors\n"
" spu [A.] raw SPU-ADPCM mono data\n"
" spui [A.] raw SPU-ADPCM interleaved data\n"
" vag [A.] .vag SPU-ADPCM mono\n"
" vagi [A.] .vag SPU-ADPCM interleaved\n"
" str2 [AV] v2 .str video, 2336-byte sectors\n"
" str2cd [AV] v2 .str video, 2352-byte sectors\n"
" sbs2 [.V] v2 .sbs video, 2048-byte sectors\n"
" -F num Set the XA file number for xa/str2 (0-255)\n"
" -C num Set the XA channel number for xa/str2 (0-31)\n"
"\nAudio options:\n"
" -f freq Use specified sample rate (must be 18900 or 37800 for xa/str2)\n"
" -b bitdepth Use specified bit depth for xa/str2 (4 or 8)\n"
" -c channels Use specified channel count (1-2 for xa/str2, 1-24 for spui/vagi)\n"
" -L Add a loop marker at the end of SPU-ADPCM data\n"
" -R key=value,... Pass custom options to libswresample (see ffmpeg docs)\n"
"\nSPU interleaving options (spui/vagi format):\n"
" -i size Use specified interleave\n"
" -a size Pad header and each interleaved chunk to specified size\n"
"\nVideo options (str2/str2cd/sbs2 format):\n"
" -s WxH Rescale input file to fit within specified size (default 320x240)\n"
" -I Force stretching to given size without preserving aspect ratio\n"
" -S key=value,... Pass custom options to libswscale (see ffmpeg docs)\n"
" -r num/den Set frame rate to specified integer or fraction (default 15)\n"
" -x speed Set the CD-ROM speed the file is meant to be played at (1-2)\n"
" -a size Set the size of each frame for sbs2\n"
);
}
int parse_args(settings_t* settings, int argc, char** argv) {
int c, i;
char *next;
while ((c = getopt(argc, argv, "?hqt:F:C:f:b:c:LR:i:a:s:IS:r:x:")) != -1) {
switch (c) {
case '?':
case 'h': {
print_help();
return -1;
} break;
case 'q': {
settings->quiet = true;
settings->show_progress = false;
} break;
case 't': {
settings->format = -1;
for (i = 0; i < NUM_FORMATS; i++) {
if (!strcmp(optarg, format_names[i])) {
settings->format = i;
break;
}
}
if (settings->format < 0) {
fprintf(stderr, "Invalid format: %s\n", optarg);
return -1;
}
} break;
case 'F': {
settings->file_number = strtol(optarg, NULL, 0);
if (settings->file_number < 0 || settings->file_number > 255) {
fprintf(stderr, "Invalid file number: %d\n", settings->file_number);
return -1;
}
} break;
case 'C': {
settings->channel_number = strtol(optarg, NULL, 0);
if (settings->channel_number < 0 || settings->channel_number > 31) {
fprintf(stderr, "Invalid channel number: %d\n", settings->channel_number);
return -1;
}
} break;
case 'f': {
settings->frequency = strtol(optarg, NULL, 0);
} break;
case 'b': {
settings->bits_per_sample = strtol(optarg, NULL, 0);
if (settings->bits_per_sample != 4 && settings->bits_per_sample != 8) {
fprintf(stderr, "Invalid bit depth: %d\n", settings->bits_per_sample);
return -1;
}
} break;
case 'c': {
settings->channels = strtol(optarg, NULL, 0);
if (settings->channels < 1 || settings->channels > 24) {
fprintf(stderr, "Invalid channel count: %d\n", settings->channels);
return -1;
}
} break;
case 'L': {
settings->loop = true;
} break;
case 'R': {
settings->swresample_options = optarg;
} break;
case 'i': {
// Round the interleave up to a multiple of 16 (the size in bytes of one SPU-ADPCM block)
settings->interleave = (strtol(optarg, NULL, 0) + 15) & ~15;
if (settings->interleave < 16) {
fprintf(stderr, "Invalid interleave: %d\n", settings->interleave);
return -1;
}
} break;
case 'a': {
settings->alignment = strtol(optarg, NULL, 0);
if (settings->alignment < 1) {
fprintf(stderr, "Invalid alignment: %d\n", settings->alignment);
return -1;
}
} break;
case 's': {
// Round the width and height up to multiples of 16 (MDEC macroblocks are 16x16 pixels)
settings->video_width = (strtol(optarg, &next, 0) + 15) & ~15;
if (*next != 'x') {
fprintf(stderr, "Invalid video size (must be specified as <width>x<height>)\n");
return -1;
}
settings->video_height = (strtol(next + 1, NULL, 0) + 15) & ~15;
if (settings->video_width < 16 || settings->video_width > 320) {
fprintf(stderr, "Invalid video width: %d\n", settings->video_width);
return -1;
}
if (settings->video_height < 16 || settings->video_height > 240) {
fprintf(stderr, "Invalid video height: %d\n", settings->video_height);
return -1;
}
} break;
case 'I': {
settings->ignore_aspect_ratio = true;
} break;
case 'S': {
settings->swscale_options = optarg;
} break;
case 'r': {
settings->video_fps_num = strtol(optarg, &next, 0);
if (*next == '/') {
settings->video_fps_den = strtol(next + 1, NULL, 0);
} else {
settings->video_fps_den = 1;
}
if (!settings->video_fps_den) {
fprintf(stderr, "Invalid frame rate denominator\n");
return -1;
}
i = settings->video_fps_num / settings->video_fps_den;
if (i < 1 || i > 30) {
fprintf(stderr, "Invalid frame rate: %d/%d\n", settings->video_fps_num, settings->video_fps_den);
return -1;
}
} break;
case 'x': {
settings->cd_speed = strtol(optarg, NULL, 0);
if (settings->cd_speed < 1 || settings->cd_speed > 2) {
fprintf(stderr, "Invalid CD-ROM speed: %d\n", settings->cd_speed);
return -1;
}
} break;
}
}
// Validate settings
switch (settings->format) {
case FORMAT_XA:
case FORMAT_XACD:
case FORMAT_STR2:
case FORMAT_STR2CD:
if (settings->frequency != PSX_AUDIO_XA_FREQ_SINGLE && settings->frequency != PSX_AUDIO_XA_FREQ_DOUBLE) {
fprintf(
stderr, "Invalid XA-ADPCM frequency: %d Hz (must be %d or %d Hz)\n", settings->frequency,
PSX_AUDIO_XA_FREQ_SINGLE, PSX_AUDIO_XA_FREQ_DOUBLE
);
return -1;
}
if (settings->channels > 2) {
fprintf(stderr, "Invalid XA-ADPCM channel count: %d (must be 1 or 2)\n", settings->channels);
return -1;
}
if (settings->loop) {
fprintf(stderr, "XA-ADPCM does not support loop markers\n");
return -1;
}
break;
case FORMAT_SPU:
case FORMAT_VAG:
if (settings->bits_per_sample != 4) {
fprintf(stderr, "Invalid SPU-ADPCM bit depth: %d (must be 4)\n", settings->bits_per_sample);
return -1;
}
if (settings->channels != 1) {
fprintf(stderr, "Invalid SPU-ADPCM channel count: %d (must be 1)\n", settings->channels);
return -1;
}
if (settings->interleave) {
fprintf(stderr, "Interleave cannot be specified for mono SPU-ADPCM\n");
return -1;
}
break;
case FORMAT_SPUI:
case FORMAT_VAGI:
if (settings->bits_per_sample != 4) {
fprintf(stderr, "Invalid SPU-ADPCM bit depth: %d (must be 4)\n", settings->bits_per_sample);
return -1;
}
if (!settings->interleave) {
fprintf(stderr, "Interleave must be specified for interleaved SPU-ADPCM\n");
return -1;
}
break;
case FORMAT_SBS2:
if (!settings->alignment) {
fprintf(stderr, "Alignment (frame size) must be specified\n");
return -1;
}
if (settings->alignment < 256) {
fprintf(stderr, "Invalid frame size: %d (must be at least 256)\n", settings->alignment);
return -1;
}
break;
default:
fprintf(stderr, "Output format must be specified\n");
return -1;
}
return optind;
}
int main(int argc, char **argv) {
settings_t settings;
int arg_offset;
FILE* output;
memset(&settings,0,sizeof(settings_t));
settings.quiet = false;
settings.show_progress = isatty(fileno(stderr));
settings.format = -1;
settings.file_number = 0;
settings.channel_number = 0;
settings.cd_speed = 2;
settings.channels = 1;
settings.frequency = PSX_AUDIO_XA_FREQ_DOUBLE;
settings.bits_per_sample = 4;
settings.interleave = 0;
settings.alignment = 2048;
settings.loop = false;
// NOTE: ffmpeg/ffplay's .str demuxer has the frame rate hardcoded to 15 fps,
// so if you change the frame rate, make sure to test the generated files
// with another player and/or in an emulator.
settings.video_width = 320;
settings.video_height = 240;
settings.video_fps_num = 15;
settings.video_fps_den = 1;
settings.ignore_aspect_ratio = false;
settings.swresample_options = NULL;
settings.swscale_options = NULL;
settings.audio_samples = NULL;
settings.audio_sample_count = 0;
settings.video_frames = NULL;
settings.video_frame_count = 0;
for(int i = 0; i < 6; i++) {
settings.state_vid.dct_block_lists[i] = NULL;
}
if (argc < 2) {
print_help();
return 1;
}
arg_offset = parse_args(&settings, argc, argv);
if (arg_offset < 0) {
return 1;
} else if (argc < arg_offset + 2) {
print_help();
return 1;
}
bool has_audio = (settings.format != FORMAT_SBS2);
bool has_video = (settings.format == FORMAT_STR2) ||
(settings.format == FORMAT_STR2CD) || (settings.format == FORMAT_SBS2);
bool did_open_data = open_av_data(argv[arg_offset + 0], &settings,
has_audio, has_video, !has_video, has_video);
if (!did_open_data) {
fprintf(stderr, "Could not open input file!\n");
return 1;
}
output = fopen(argv[arg_offset + 1], "wb");
if (output == NULL) {
fprintf(stderr, "Could not open output file!\n");
// Release the input opened by open_av_data() before bailing out
close_av_data(&settings);
return 1;
}
settings.start_time = time(NULL);
settings.last_progress_update = 0;
switch (settings.format) {
case FORMAT_XA:
case FORMAT_XACD:
if (!settings.quiet) {
fprintf(stderr, "Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
settings.frequency, settings.bits_per_sample,
(settings.channels == 2) ? "stereo" : "mono",
settings.file_number, settings.channel_number
);
}
encode_file_xa(&settings, output);
break;
case FORMAT_SPU:
case FORMAT_VAG:
if (!settings.quiet) {
fprintf(stderr, "Audio format: SPU-ADPCM, %d Hz mono\n",
settings.frequency
);
}
encode_file_spu(&settings, output);
break;
case FORMAT_SPUI:
case FORMAT_VAGI:
if (!settings.quiet) {
fprintf(stderr, "Audio format: SPU-ADPCM, %d Hz %d channels, interleave=%d\n",
settings.frequency, settings.channels, settings.interleave
);
}
encode_file_spu_interleaved(&settings, output);
break;
case FORMAT_STR2:
case FORMAT_STR2CD:
if (!settings.quiet) {
if (settings.decoder_state_av.audio_stream) {
fprintf(stderr, "Audio format: XA-ADPCM, %d Hz %d-bit %s, F=%d C=%d\n",
settings.frequency, settings.bits_per_sample,
(settings.channels == 2) ? "stereo" : "mono",
settings.file_number, settings.channel_number
);
}
fprintf(stderr, "Video format: BS v2, %dx%d, %.2f fps\n",
settings.video_width, settings.video_height,
(double)settings.video_fps_num / (double)settings.video_fps_den
);
}
encode_file_str(&settings, output);
break;
case FORMAT_SBS2:
if (!settings.quiet) {
fprintf(stderr, "Video format: BS v2, %dx%d, %.2f fps\n",
settings.video_width, settings.video_height,
(double)settings.video_fps_num / (double)settings.video_fps_den
);
}
encode_file_sbs(&settings, output);
break;
}
if (settings.show_progress) {
fprintf(stderr, "\nDone.\n");
}
fclose(output);
close_av_data(&settings);
return 0;
}
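
The `-s WxH` handling in `parse_args()` rounds each dimension up to a multiple of 16 and then checks it against the 320x240 limit, but it accepts strings with missing digits or trailing characters because the `strtol()` end pointers are only partly checked. The following is a minimal sketch of a stricter variant, not code from psxavenc: `round_up_16()` and `parse_video_size()` are hypothetical names, and unlike the original it parses in base 10 only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not part of psxavenc): round a non-negative value up
 * to the next multiple of 16, same arithmetic as "(x + 15) & ~15". */
static int round_up_16(long value) {
	return (int)((value + 15) & ~15L);
}

/* Hypothetical helper (not part of psxavenc): parse "<width>x<height>".
 * Returns true and stores the rounded dimensions on success; returns false
 * and leaves the outputs untouched if the string is malformed or the size
 * falls outside 1x1 to 320x240 before rounding. */
static bool parse_video_size(const char *spec, int *width, int *height) {
	char *next;
	const char *height_start;
	long w, h;

	assert(spec != NULL && width != NULL && height != NULL);

	/* The width must contain at least one digit and be followed by 'x'. */
	w = strtol(spec, &next, 10);
	if (next == spec || *next != 'x')
		return false;

	/* The height must contain at least one digit and end the string. */
	height_start = next + 1;
	h = strtol(height_start, &next, 10);
	if (next == height_start || *next != '\0')
		return false;

	/* Any value from 1 to 320 (or 240) rounds to a multiple of 16 that is
	 * still within the limits enforced by parse_args(). */
	if (w < 1 || w > 320 || h < 1 || h > 240)
		return false;

	*width = round_up_16(w);
	*height = round_up_16(h);
	return true;
}
```

A caller would pass `optarg` and pointers to the width and height fields, printing the existing "Invalid video size" message whenever the function returns false.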