For Docker-based builds, please refer to Docker Compilation Steps.
- Download and install Emscripten using the following instructions (skip this step if the emsdk toolchain is already installed)
  - Get the sdk:
    git clone https://github.com/emscripten-core/emsdk.git
  - Enter the cloned directory:
    cd emsdk
  - Install the sdk tools:
    ./emsdk install 3.1.8
  - Activate the sdk tools:
    ./emsdk activate 3.1.8
  - Activate path variables:
    source ./emsdk_env.sh

  The EMSDK environment variable will point to the valid emsdk repo after executing the instructions above.
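Before compiling, it can help to confirm that the environment was actually activated in the current shell. The following is a minimal sketch, not part of emsdk: `check_emsdk` is a hypothetical helper name, and it only checks that `EMSDK` is set and that `emcc` can be found on `PATH`.

```shell
# Sketch: confirm the emsdk environment is active in this shell.
# check_emsdk is a hypothetical helper, not something emsdk provides.
check_emsdk() {
  # EMSDK is exported by emsdk_env.sh
  if [ -z "${EMSDK:-}" ]; then
    echo "EMSDK is not set; run 'source ./emsdk_env.sh' first" >&2
    return 1
  fi
  # emcc should be reachable once the path variables are activated
  if ! command -v emcc >/dev/null 2>&1; then
    echo "emcc is not on PATH" >&2
    return 1
  fi
  echo "EMSDK=${EMSDK}"
}
```

Running `check_emsdk` in a fresh shell should fail until `source ./emsdk_env.sh` has been executed there; `emcc --version` can then be used to see which version is active.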
- Compile
  - Create a build directory (e.g. build-wasm) and run the compilation commands inside it as below:
    mkdir build-wasm; cd build-wasm
    emcmake cmake -DCOMPILE_CUDA=off -DUSE_STATIC_LIBS=on -DUSE_DOXYGEN=off -DUSE_FBGEMM=off -DUSE_MKL=off -DUSE_NCCL=off -DUSE_WASM_COMPATIBLE_SOURCE=on -DCOMPILE_WASM=on ../
    emmake make -j

    The artifacts (.js and .wasm files) will be available in the build-wasm folder.
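A quick way to confirm that the compilation produced its artifacts is to look for at least one .js and one .wasm file in the build directory. This is only a sketch: `has_wasm_artifacts` is a hypothetical helper name, and it does not check which specific files the build is supposed to emit.

```shell
# has_wasm_artifacts DIR: succeed only if DIR contains at least one
# .js file and at least one .wasm file (hypothetical helper).
has_wasm_artifacts() (
  dir=${1:-.}
  found_js=0
  found_wasm=0
  for f in "$dir"/*.js; do
    if [ -f "$f" ]; then found_js=1; break; fi
  done
  for f in "$dir"/*.wasm; do
    if [ -f "$f" ]; then found_wasm=1; break; fi
  done
  [ "$found_js" -eq 1 ] && [ "$found_wasm" -eq 1 ]
)
```

For example, `has_wasm_artifacts build-wasm && echo "artifacts present"` can be run from the repository root after the build finishes.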
- Pre-processing (Package files to WASM-compiled runtime)
  This step is required to be able to perform translation using the wasm binary.
  The script package-benchmark.sh inside the wasm folder downloads and packages the Bergamot project specific Spanish to English translation models, vocabulary, lexical shortlist files and a News test file as source text for translation. (If sacrebleu is not already installed, please install it first using the command: pip install sacrebleu.)
  From the build directory (build-wasm for local builds or build-wasm-docker for docker builds), run:
    bash ../wasm/package-benchmark.sh
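The packaging step depends on two things: sacrebleu being installed and the script being run from inside the build directory. A small wrapper can enforce both. This is a sketch under assumptions: `run_packaging` is a hypothetical helper name, and it assumes that installing sacrebleu with pip has put the `sacrebleu` command on `PATH`.

```shell
# run_packaging [BUILD_DIR]: run package-benchmark.sh from inside BUILD_DIR
# (default: build-wasm). Hypothetical helper, not part of the repository.
run_packaging() (
  build_dir=${1:-build-wasm}
  # Assumption: `pip install sacrebleu` makes a `sacrebleu` command available
  if ! command -v sacrebleu >/dev/null 2>&1; then
    echo "sacrebleu not found; install it with: pip install sacrebleu" >&2
    exit 1
  fi
  # The script is addressed relative to the build directory
  cd "$build_dir" || exit 1
  bash ../wasm/package-benchmark.sh
)
```

For docker builds, the same helper could be called as `run_packaging build-wasm-docker` from the repository root.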
- Perform Translation
  - Launch the emscripten-generated HTML page in a web browser using the following command:
    emrun --no_browser --port 8000 .
  - Open the following link in the Firefox Nightly browser (replace stdin-input with the text that you want to translate and command-line-args with the appropriate model, vocabulary files etc.):
    http://localhost:8000/marian-decoder.html?stdinInput=<stdin-input>&arguments=<command-line-args>
    e.g. To translate "Hola mundo" to English using the Bergamot project specific Spanish to English files (packaged above), open this link:
    http://localhost:8000/marian-decoder.html?stdinInput=Hola mundo&arguments=-m /model.npz -v /vocab.esen.spm /vocab.esen.spm --cpu-threads 1
    Note: To run in Chrome, launch Chrome with --js-flags="--experimental-wasm-simd", e.g.:
    /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --js-flags="--experimental-wasm-simd"
    Please remember that the Developer Tools must not be open when opening the links or refreshing the page to run the benchmark again.
  - After the page has loaded, open the Developer Tools; the result should appear in the console.
  - If you used the package-benchmark.sh script mentioned above, open the following links for benchmarking (please remember that the Developer Tools must not be open when opening the links or refreshing the page to run the benchmark again):
    - float32
      http://localhost:8000/marian-decoder.html?arguments=-m /model.npz -v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1
    - intgemm8
      http://localhost:8000/marian-decoder.html?arguments=-m /model.npz -v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1 --int8shift
    - intgemm8 with binary model file
      http://localhost:8000/marian-decoder.html?arguments=-m /model.intgemm.bin -v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1 --int8shift
    - intgemm8alphas
      http://localhost:8000/marian-decoder.html?arguments=-m /model.npz -v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1 --int8shiftAlphaAll
    - intgemm8alphas with binary model file
      http://localhost:8000/marian-decoder.html?arguments=-m /model.intgemm.alphas.bin -v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1 --int8shiftAlphaAll
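The links above contain spaces, which browsers normally percent-encode when the link is pasted into the address bar. When the links are generated from a script instead, the encoding can be done explicitly. The sketch below is one way to do that in POSIX sh; `urlencode`, `marian_url`, `benchmark_urls` and `BENCH_COMMON_ARGS` are hypothetical names introduced here, only ASCII input is considered, and it assumes the page decodes percent-encoded query values the same way it handles the space-containing links above.

```shell
# urlencode STRING: percent-encode every character except unreserved ones and "/".
# Sketch for ASCII input only; non-ASCII text is not handled here.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_/-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s' "$out"
)

# marian_url ARGS [STDIN_TEXT]: build a marian-decoder.html link for the
# emrun server on port 8000, using the query parameters shown above.
marian_url() (
  args=$1
  text=${2:-}
  url="http://localhost:8000/marian-decoder.html?"
  if [ -n "$text" ]; then
    url="${url}stdinInput=$(urlencode "$text")&"
  fi
  url="${url}arguments=$(urlencode "$args")"
  printf '%s\n' "$url"
)

# Arguments shared by all five benchmark configurations listed above.
BENCH_COMMON_ARGS="-v /vocab.esen.spm /vocab.esen.spm -i /newstest2013.es.top300lines --beam-size 1 --mini-batch 32 --maxi-batch 100 --maxi-batch-sort src -w 128 --skip-cost --shortlist /lex.s2t 50 50 --cpu-threads 1"

# benchmark_urls: print one link per benchmark configuration, in the order above.
benchmark_urls() {
  marian_url "-m /model.npz $BENCH_COMMON_ARGS"
  marian_url "-m /model.npz $BENCH_COMMON_ARGS --int8shift"
  marian_url "-m /model.intgemm.bin $BENCH_COMMON_ARGS --int8shift"
  marian_url "-m /model.npz $BENCH_COMMON_ARGS --int8shiftAlphaAll"
  marian_url "-m /model.intgemm.alphas.bin $BENCH_COMMON_ARGS --int8shiftAlphaAll"
}
```

For example, `marian_url "-m /model.npz -v /vocab.esen.spm /vocab.esen.spm --cpu-threads 1" "Hola mundo"` prints `http://localhost:8000/marian-decoder.html?stdinInput=Hola%20mundo&arguments=-m%20/model.npz%20-v%20/vocab.esen.spm%20/vocab.esen.spm%20--cpu-threads%201`. Keeping the shared arguments in one variable makes it easier to see that the five configurations differ only in the model file and the trailing int8 flag.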
Alternatively, the wasm marian-decoder can be compiled using Docker.
- Prepare docker image for WASM compilation
  This step is required only for the first time (or after any changes to the docker image)
    make wasm-image
- Compile to wasm
    make compile-wasm-docker
  The artifacts (.js and .wasm files) will be available in the build-wasm-docker folder in the root of this repository.
- Performing Translation
  Please follow the Perform Translation steps above for this.
- Clean build (to start next compilation from scratch)
    make clean-wasm-docker
- Compile and run a wasm stdin test:
    make compile-and-run-stdin-test-wasm
    open "http://localhost:8009/compile-test-stdin-wasm.html"
- Enter a docker container shell for manually running commands:
    make wasm-shell