Install julius on Raspberry Pi

Environment

$ uname -a
Linux rpi4 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr  3 17:24:16 BST 2023 aarch64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 11 (bullseye)
Release:	11
Codename:	bullseye

Installation

Install dependencies.

sudo apt-get install build-essential zlib1g-dev libsdl2-dev libasound2-dev

In any directory on the Raspberry Pi, create a working directory (in my case, julius-speech) and move into it.

mkdir julius-speech
cd julius-speech

Clone julius and Japanese dictation kit from git repositories.

git clone git@github.com:julius-speech/julius.git
git clone git@github.com:julius-speech/dictation-kit.git

Build Julius. Note that the architecture of my Raspberry Pi 4 is aarch64; you can check yours with the arch command.

cd julius/
./configure --with-mictype=alsa --build=aarch64
make -j4
sudo make install
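
The --build value can also be chosen from the machine name instead of being typed by hand. This is a minimal sketch, assuming uname -m reports the same name as arch. julius_build_flag is a hypothetical helper name, and only the aarch64 case comes from this post.

```shell
# Hypothetical helper: print the --build flag for a given machine name.
# Only aarch64 was tried in this post; for any other name nothing is
# printed, so configure falls back to its own detection.
julius_build_flag() {
    case "$1" in
        aarch64) printf '%s\n' '--build=aarch64' ;;
        *)       printf '%s\n' '' ;;
    esac
}

# Usage (inside the julius/ directory):
#   ./configure --with-mictype=alsa $(julius_build_flag "$(uname -m)")
```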

Specify sound card

$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: Device [USB PnP Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

This shows that the microphone is card 0, device 0.

Export environment variable

export ALSADEV=plughw:0,0

You can add this export line to ~/.profile so that it persists across logins.
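
The card and device numbers can also be read from the arecord -l output by a script rather than by eye. This is a rough sketch, assuming the first capture device listed is the microphone you want and that arecord prints English messages (hence LC_ALL=C in the usage line). first_capture_device is a hypothetical helper name.

```shell
# Hypothetical helper: read `arecord -l` output on stdin and print
# plughw:<card>,<device> for the first capture device listed.
# Assumes English output of the form "card N: ..., device M: ...".
first_capture_device() {
    sed -n 's/^card \([0-9][0-9]*\): .*, device \([0-9][0-9]*\): .*/plughw:\1,\2/p' | head -n 1
}

# Usage:
#   export ALSADEV="$(LC_ALL=C arecord -l | first_capture_device)"
```

If more than one capture device is attached, check the output by hand before relying on the first match.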

Note that setting hw:0,0 (instead of plughw:0,0) did not work for dictation. This may be a known issue with how ALSA handles sample rates on the raw hardware device.

I had to use plughw:0,0 instead because that allowed ALSA to make the conversion. hw:0,0 sends the audio straight to the DAC without any conversion. No other mpd.conf configuration is necessary besides specifying hw:0,0.

For details, refer to:

With hw:0,0, the end of Julius's startup messages reads:

Stat: adin_alsa: device name from ALSADEV: "hw:0,0"
Warning: adin_alsa: the exact rate 16000 Hz is not available by your PCM hardware.
Warning: adin_alsa: using 48000 Hz instead.
Stat: capture audio at 48000Hz
Stat: adin_alsa: latency set to 32 msec (chunk = 1536 bytes)
Stat: "hw:0,0": Device [USB PnP Audio Device] device USB Audio [USB Audio] subdevice #0
STAT: AD-in thread created
pass1_best:  うん 、 何           
sentence1:  む 、 ラブ 。

With plughw:0,0, it reads:

Stat: adin_alsa: device name from ALSADEV: "plughw:0,0"
Stat: capture audio at 16000Hz
Stat: adin_alsa: latency set to 32 msec (chunk = 512 bytes)
Stat: "plughw:0,0": Device [USB PnP Audio Device] device USB Audio [USB Audio] subdevice #0
STAT: AD-in thread created
<<< please speak >>>Warning: strip: sample 0-17 has zero value, stripped

pass1_best: <input rejected by short input>
pass1_best:  こんにちは           
sentence1:  こんにちは 。
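
The difference between the two logs is the warning that 16000 Hz is not available, so a saved log can be checked for that line. This is a small sketch, assuming the startup messages were saved to a file (hypothetically julius.log). julius_rate_ok is a hypothetical helper name, and the matched text is taken from the warning shown above.

```shell
# Hypothetical helper: read Julius startup messages on stdin and fail
# (non-zero exit status) if ALSA could not provide the requested rate.
julius_rate_ok() {
    ! grep -q 'is not available by your PCM hardware'
}

# Usage, with julius.log being a hypothetical file holding the messages:
#   julius_rate_ok < julius.log || echo "try ALSADEV=plughw:0,0"
```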

Run the dictation kit demo.

cd ../dictation-kit
julius -C main.jconf -C am-gmm.jconf -demo

References