Allwinner Technology A40i Domestic Development Board – Comprehensive Test of Performance Parameters
The board under test is a development board based on the Allwinner Technology A40i, with rich interface resources. It brings out dual network ports, dual CAN, dual USB, dual RS485 and other communication interfaces; carries on-board Bluetooth, Wi-Fi and (optional) 4G modules; and provides MIPI LCD, LVDS LCD, TFT LCD, HDMI OUT, CVBS OUT, CAMERA, LINE IN, H/P OUT and other audio and video multimedia interfaces. It supports dual-screen display, 1080P@45fps H.264 hardware video encoding, 1080P@60fps H.264 hardware video decoding, and a SATA interface for large-capacity storage.
The following evaluation was written by a user of the board.
Preface
An earlier post covered the development environment. This one takes a qualitative look at the board's performance in several areas.
Benchmark Score (CoreMark)
Open a WSL terminal and download the CoreMark source:
git clone https://github.com/eembc/coremark.git
cd coremark/
vi simple/core_portme.h
Change
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \
FLAGS_STR /* "Please put compiler flags here (e.g. -o3)" */
#endif
to
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS \
"-O3" /* "Please put compiler flags here (e.g. -o3)" */
#endif
For the -O0 build, set the string to "-O0" instead.
Also change
typedef ee_u32 ee_ptr_int;
to
typedef unsigned long ee_ptr_int;
Compilation
export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin
arm-linux-gnueabihf-gcc -o coremarko0 core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -Isimple -I. -O0
arm-linux-gnueabihf-gcc -o coremarko3 core_list_join.c core_main.c core_matrix.c core_state.c core_util.c simple/core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -Isimple -I. -O3
Copy the binaries to Windows
cp coremarko0 coremarko3 /mnt/d
Download them to the development board and make them executable
chmod +x coremarko0 coremarko3
Run
./coremarko0
./coremarko3
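The results are as follows. The optimization level makes a large difference: the -O3 build finishes the same 100000 iterations in about one fifth of the time. Note that the second log still reports -O0 in its "Compiler flags" field. That string is hard-coded in core_portme.h and was evidently not changed before the -O3 build, so the field does not reflect the flags actually used.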
root@T3/A40i-Tronlong:~# ./coremarko0
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 146952831
Total time (secs): 146.952831
Iterations/Sec : 680.490463
Iterations : 100000
Compiler version : GCC9.4.0
Compiler flags : -O0
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xd340
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 680.490463 / GCC9.4.0 -O0 / STACK
root@T3/A40i-Tronlong:~# ./coremarko3
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 29362505
Total time (secs): 29.362505
Iterations/Sec : 3405.703975
Iterations : 100000
Compiler version : GCC9.4.0
Compiler flags : -O0
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xd340
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 3405.703975 / GCC9.4.0 -O0 / STACK
On https://www.eembc.org/coremark/scores.php, search for Cortex-A7 to compare against published scores for the same CPU core.
Cortex-A7, 1.2 GHz
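The numbers in the two logs can be cross-checked with a little arithmetic. The following is only a sketch in Python (not part of the original test): it recomputes Iterations/Sec from the logged iteration count and total time, then derives the -O3 speedup and a CoreMark/MHz figure. The 1.2 GHz clock is taken from the comparison entry above and is assumed, not measured, to be the frequency the core was running at during the test.

```python
# Values copied from the two CoreMark logs above.
ITERATIONS = 100_000
TIME_O0_S = 146.952831   # "Total time (secs)" of ./coremarko0
TIME_O3_S = 29.362505    # "Total time (secs)" of ./coremarko3
CPU_MHZ = 1200           # assumption: core clocked at 1.2 GHz during the run


def iterations_per_sec(iterations, seconds):
    """CoreMark score: iterations completed per second."""
    return iterations / seconds


score_o0 = iterations_per_sec(ITERATIONS, TIME_O0_S)
score_o3 = iterations_per_sec(ITERATIONS, TIME_O3_S)
speedup = score_o3 / score_o0           # how much faster the -O3 build is
coremark_per_mhz = score_o3 / CPU_MHZ   # normalised score for comparisons

print(f"-O0: {score_o0:.2f} iterations/s")
print(f"-O3: {score_o3:.2f} iterations/s")
print(f"speedup -O3 vs -O0: {speedup:.2f}x")
print(f"CoreMark/MHz at -O3: {coremark_per_mhz:.2f}")
```

A per-MHz figure is the easiest way to compare against entries on the score list that run at a different clock.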
RAM Performance Test
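In WSL, download and cross-compile STREAM, copy it to Windows, then download it to the board and run it: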
git clone https://github.com/qinyunti/STREAM.git
cd STREAM/
export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin
arm-linux-gnueabihf-gcc -O3 -DSTREAM_ARRAY_SIZE=5000000 stream.c -o stream.5M
cp stream.5M /mnt/d
chmod +x stream.5M
./stream.5M
root@T3/A40i-Tronlong:~# ./stream.5M
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 5000000 (elements), Offset = 0 (elements)
Memory per array = 38.1 MiB (= 0.0 GiB).
Total memory required = 114.4 MiB (= 0.1 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 52219 microseconds.
(= 52219 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             972.1     0.083436     0.082297     0.084256
Scale:            868.5     0.092398     0.092110     0.092609
Add:              829.7     0.144716     0.144639     0.144788
Triad:            683.4     0.175755     0.175587     0.175917
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
Refer to https://www.cs.virginia.edu/stream/ref.html
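To make the "Best Rate" column less of a black box, here is a small Python sketch (an illustration, not part of the original test) that rebuilds it from the array size and the "Min time" column. It relies on how STREAM counts traffic: Copy and Scale touch two arrays per element, Add and Triad touch three, and the rate is reported in units of 10^6 bytes per second.

```python
# Parameters and timings copied from the STREAM log above.
ARRAY_ELEMENTS = 5_000_000
BYTES_PER_ELEMENT = 8

# Number of arrays each kernel reads or writes per element.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column, in seconds.
MIN_TIME_S = {"Copy": 0.082297, "Scale": 0.092110,
              "Add": 0.144639, "Triad": 0.175587}


def best_rate_mb_s(kernel):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) from the best run."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_ELEMENTS
    return bytes_moved / MIN_TIME_S[kernel] / 1e6


# Memory footprint lines of the log, in MiB (2**20 bytes).
mib_per_array = ARRAY_ELEMENTS * BYTES_PER_ELEMENT / 2**20
mib_total = 3 * mib_per_array

for kernel in MIN_TIME_S:
    print(f"{kernel:6s} {best_rate_mb_s(kernel):7.1f} MB/s")
print(f"per array: {mib_per_array:.1f} MiB, total: {mib_total:.1f} MiB")
```

The recomputed rates agree with the log to one decimal place, which confirms that the reported bandwidth is derived from the fastest of the timed runs, not the average.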
RAM Stress Test
Refer to https://pyropus.ca./software/memtester/
wget https://pyropus.ca./software/memtester/old-versions/memtester-4.5.1.tar.gz
tar -xvf memtester-4.5.1.tar.gz
cd memtester-4.5.1/
export PATH=$PATH:~/lichee/out/sun8iw11p1/linux/common/buildroot/host/usr/bin
arm-linux-gnueabihf-gcc -O3 memtester.c tests.c -o memtester
Copy the binary to Windows, then download it to the development board and make it executable
cp memtester /mnt/d
chmod +x memtester
root@T3/A40i-Tronlong:~# ./memtester 128M 1
memtester version 4.5.1 (32-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffff000
want 128MB (134217728 bytes)
got 128MB (134217728 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
Done.
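The header lines of the memtester log are easy to sanity-check. The short Python sketch below (illustrative only) reproduces the byte count for the 128M request, the page mask printed for a 4096-byte page on this 32-bit build, and the number of pages memtester has to lock.

```python
PAGE_SIZE = 4096    # "pagesize is 4096" in the log
REQUEST_MIB = 128   # the "128M" argument passed to memtester

# 128 MiB expressed in bytes, as printed in the "want"/"got" lines.
bytes_requested = REQUEST_MIB * 1024 * 1024

# Mask that clears the offset bits within a page, truncated to 32 bits
# because the log reports a 32-bit memtester build.
pagesize_mask = ~(PAGE_SIZE - 1) & 0xFFFFFFFF

# Number of whole pages covered by the test region.
pages = bytes_requested // PAGE_SIZE

print(bytes_requested)        # 134217728
print(hex(pagesize_mask))     # 0xfffff000
print(pages)                  # 32768
```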
eMMC Performance Test
dmesg | grep mmc
The board carries a 4 GB eMMC:
[ 4.008550] mmc0: new HS200 MMC card at address 0001
[ 4.009409] mmcblk0: mmc0:0001 S04111 3.56 GiB
[ 8.202017] mmc1: new high speed SDHC card at address aaaa
[ 8.208872] mmcblk1: mmc1:aaaa SL16G 14.8 GiB
The eMMC is running in HS200 mode.
root@T3/A40i-Tronlong:~# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 2029971 514680 1406338 27% /
devtmpfs 107996 0 107996 0% /dev
tmpfs 124604 0 124604 0% /dev/shm
tmpfs 124604 8 124596 0% /tmp
tmpfs 124604 12 124592 0% /run
cgroup 124604 0 124604 0% /sys/fs/cgroup
root@T3/A40i-Tronlong:~#
With no SD card inserted, / is mounted on the eMMC, so the following commands write to and read from the eMMC.
root@T3/A40i-Tronlong:/# time dd if=/dev/zero of=/test.bin bs=16k count=65536
65536+0 records in
65536+0 records out
real 0m37.581s
user 0m0.080s
sys 0m15.230s
root@T3/A40i-Tronlong:/# time dd if=test.bin of=/dev/null bs=16k count=65536
65536+0 records in
65536+0 records out
real 0m10.386s
user 0m0.070s
sys 0m4.040s
root@T3/A40i-Tronlong:/#
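The dd output above gives no rate, so the throughput has to be worked out from the block size, the block count and the "real" time. A minimal Python sketch of that calculation follows (illustrative only). Because the test does not bypass or flush the page cache, the results are rough figures, not true device speeds.

```python
# dd parameters from the commands above: bs=16k count=65536 (1 GiB in total).
BLOCK_SIZE = 16 * 1024
COUNT = 65536


def throughput_mib_s(real_seconds, block_size=BLOCK_SIZE, count=COUNT):
    """Average throughput in MiB/s for a dd run that took real_seconds."""
    total_bytes = block_size * count
    return total_bytes / real_seconds / (1024 * 1024)


emmc_write = throughput_mib_s(37.581)   # "real 0m37.581s" (write)
emmc_read = throughput_mib_s(10.386)    # "real 0m10.386s" (read)

print(f"eMMC write: {emmc_write:.2f} MiB/s")
print(f"eMMC read:  {emmc_read:.2f} MiB/s")
```

This works out to roughly 27 MiB/s for the write and 99 MiB/s for the read.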
SD Card Performance Test
Insert the SD card and reboot. The SD card is then automatically mounted at /root.
root@T3/A40i-Tronlong:~# time dd if=/dev/zero of=/root/test.bin bs=16k count=65536
65536+0 records in
65536+0 records out
real 1m32.412s
user 0m0.330s
sys 0m17.700s
root@T3/A40i-Tronlong:~# time dd if=/root/test.bin of=/dev/null bs=16k count=65536
65536+0 records in
65536+0 records out
real 0m48.177s
user 0m0.100s
sys 0m4.350s
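For a side-by-side view, the Python sketch below (illustrative only) converts the "real" times reported by `time` into seconds, computes the SD card throughput for the same 1 GiB transfer, and compares it with the eMMC timings from the previous section. The helper `parse_real` is a hypothetical name introduced here, and it only handles the `XmY.YYYs` format seen in these logs.

```python
TOTAL_BYTES = 16 * 1024 * 65536   # bs=16k count=65536, i.e. 1 GiB


def parse_real(text):
    """Convert a 'real' time such as '1m32.412s' into seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def mib_per_s(real_text):
    """Average throughput in MiB/s for a 1 GiB transfer."""
    return TOTAL_BYTES / parse_real(real_text) / (1024 * 1024)


# "real" times copied from the logs (SD card here, eMMC from the section above).
sd_write, sd_read = mib_per_s("1m32.412s"), mib_per_s("0m48.177s")
emmc_write, emmc_read = mib_per_s("0m37.581s"), mib_per_s("0m10.386s")

print(f"SD write {sd_write:.2f} MiB/s, read {sd_read:.2f} MiB/s")
print(f"eMMC is {emmc_write / sd_write:.2f}x faster on write, "
      f"{emmc_read / sd_read:.2f}x faster on read")
```

On these numbers the eMMC comes out about 2.5 times faster than this SD card for writing and about 4.6 times faster for reading, with the same caveat about caching as before.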
Summary
Overall, the board performed well in these tests. The results are for reference only: measurements will vary with the environment and other factors, and the storage test method is not very rigorous (for example, it does not take caching into account). This was only a qualitative look at performance. A board's performance is a composite of many things and only becomes meaningful when measured against a real application scenario, where scenario-specific optimization also matters a great deal.
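One way to reduce the cache effect mentioned above is to include the time needed to flush the data to the device in the measurement. The Python sketch below shows the idea under some assumptions: `timed_write` is a hypothetical helper, not something used in the original test, and it assumes a Python interpreter is available on the target. It writes zero-filled blocks, calls `os.fsync` before stopping the clock, and returns MiB/s.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size=16 * 1024):
    """Write total_bytes of zeros to path and return the rate in MiB/s.

    The clock stops only after os.fsync(), so data still sitting in the
    page cache is flushed to the device and counted in the result.
    """
    block = bytes(block_size)   # one zero-filled block, reused for each write
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: total_bytes - written]   # last chunk may be short
            f.write(chunk)
            written += len(chunk)
        f.flush()               # push Python's buffer to the kernel
        os.fsync(f.fileno())    # ask the kernel to write it to the device
    elapsed = time.perf_counter() - start
    return written / elapsed / (1024 * 1024)


if __name__ == "__main__":
    # Small demo in a temporary directory; use a larger size and a path on
    # the device under test for a real measurement.
    with tempfile.TemporaryDirectory() as tmp:
        rate = timed_write(os.path.join(tmp, "test.bin"), 4 * 1024 * 1024)
        print(f"{rate:.1f} MiB/s including fsync")
```

The read side needs separate care, since a file that was just written is likely to be served from memory. Reading a file larger than RAM, or dropping the page cache before the read, gives a fairer figure.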