Product Overview
ARM946E-S (Rev1)
System-on-Chip DSP enhanced processor
ARM DVI 0022A
Copyright ARM Limited 2000. All rights reserved. Page 1
Applications
Embedded applications running an RTOS
Mass storage
- HDD & DVD
Speech coders
Networking applications
- G.723.1 for voice-over IP
Automotive control
- Cruise control, ABS, etc.
- Hands-free interfaces
Modems and soft-modems
Audio decoding
- Dolby AC3 digital
- MPEG MP3 audio
Speech recognition and synthesis.
Benefits
System-on-Chip ready, allowing rapid integration with short time-to-market
ARM946E-S provides a single chip DSP and microcontroller solution
Reduced die size and chip complexity
Fast interrupt response
Reduced programming complexity
- No need to partition DSP and control code
No duplication in on-chip memory system, busing, debug, and trace resources.
The ARM946E-S™
The ARM946E-S™ is a synthesizable macrocell combining an ARM9E-S™
processor core with instruction and data caches, tightly-coupled instruction
and data SRAM memory with protection units, write buffer, and an AMBA™
(Advanced Microprocessor Bus Architecture) AHB (Advanced High-performance
Bus) interface. It is a member of the ARM9E-S Thumb family of
high-performance 32-bit System-on-Chip (SoC) processors, and it is well
suited to a wide range of embedded applications. The sizes of the instruction
and data caches, and of the instruction and data SRAM, are individually
configurable, allowing you to tailor hardware to the embedded application. The
ARM946E-S provides a complete high-performance processor solution,
offering considerable savings in chip complexity and area, chip system
design, power consumption, and time-to-market.
Compatible with ARM7™ and StrongARM
The ARM946E-S™ processor is backwards compatible with the ARM7 Thumb
Family and the StrongARM processor families, giving designers
software-compatible processors with a range of price/performance points
from 60 MIPS to 400 MIPS. Support for the ARM architecture today includes:
EPOC, JavaOS
Linux operating systems & WindowsCE
40-plus Real Time Operating Systems
Co-simulation tools from leading EDA vendors
Industry supported third-party software development tools.
DSP enhancements
The ARM946E-S processor core offers the full advantage of the enhanced
DSP capability of the ARM9E-S processor core. The ARM9E-S processor
core executes the ARMv5TE instruction set, which includes new multiplier and
saturating arithmetic functions. Multiply instructions are processed faster
using a single-cycle 32x16 implementation. There are 32x16 and 16x16
multiply instructions, and the pipeline allows one multiply to start each cycle.
New saturating arithmetic improves efficiency by automatically selecting
saturated behavior during execution. Saturating arithmetic is used to set limits
on signal processing calculations to minimize the effect of noise or signal
errors. All of these instructions are beneficial for algorithms such as those
which implement GSM protocols, Fast Fourier Transforms (FFT), and state
space servo control for HDDs.
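The effect of saturating arithmetic can be illustrated with a small portable C model. This is only a sketch of the behavior described above (clamping instead of wrapping on overflow); the function names are hypothetical and it does not represent the hardware implementation or any ARM library.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating signed 32-bit addition: the result is clamped to the
 * representable range instead of wrapping around on overflow.
 * Portable model only; sat_add32 is a hypothetical name. */
int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;  /* cannot overflow in 64 bits */

    if (wide > INT32_MAX) {
        return INT32_MAX;                    /* clamp at the upper limit */
    }
    if (wide < INT32_MIN) {
        return INT32_MIN;                    /* clamp at the lower limit */
    }
    return (int32_t)wide;
}

/* Usage sketch: overflowing sums stick at the limits. */
void sat_add32_demo(void)
{
    assert(sat_add32(INT32_MAX, 1) == INT32_MAX);
    assert(sat_add32(INT32_MIN, -1) == INT32_MIN);
    assert(sat_add32(1000, -250) == 750);
}
```

A signal sample that would overflow therefore stays at full scale rather than flipping sign, which is why saturation limits the effect of noise or signal errors.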
ARM946E-S™ processor
The ARM946E-S™ processor uses
the ARM9E-S™ synthesizable
macrocell. This macrocell combines
the ARM9™ processor core with the
powerful features and instruction set
extensions which assist DSP
applications. The architecture of the
processor core, or integer unit, is
described in more detail on page 9.
System controller
The system controller oversees the
interaction between the Instruction
Cache, Instruction RAM, Data
Cache, Data RAM, and the Bus
Interface Unit. It controls internal
arbitration between the blocks and
stalls appropriate blocks when
required.
The system controller arbitrates
between instruction and data access
to schedule single or simultaneous
requests to the cache controllers and
the Bus Interface Unit. The system
controller receives acknowledgement
from each resource to allow
execution to continue.
Control coprocessor (CP15)
The CP15 allows configuration of
the caches, the tightly coupled
SRAMs, the write buffer, and
other ARM946E-S functions.
Several registers within CP15 are
available for program control,
providing access to features such as:
big or little-endian operation
low power state
memory partitioning and protection
full memory BIST (Built-in Self Test).
Protection unit
The protection unit allows memory to
be partitioned and individual
attributes set for each protection
region. Both the instruction and data
address space can be divided into
eight regions of variable size.
The protection attributes for each
region can specify the properties of
cachable, bufferable, user access,
supervisor access, etc. The
protection unit is programmed from
the CP15 registers.
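As a rough illustration of region-based protection, the following C sketch models eight regions with per-region attributes and looks up the region that covers an address. The structure layout and names are hypothetical. The rule that the highest-numbered matching region wins is an assumption made for this sketch, and so is treating an address with no matching region as an abort; neither is stated in this overview.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGIONS 8  /* eight regions, as described in the text */

/* Hypothetical region descriptor; field names are illustrative only. */
typedef struct {
    bool     enabled;
    uint32_t base;         /* region start address */
    uint32_t size;         /* region size in bytes (below 4GB in this model) */
    bool     cacheable;
    bool     bufferable;
    bool     user_access;
} region_t;

/* Returns the index of the region covering addr, or -1 if no region
 * matches (modelled here as an abort).
 * Assumption: when regions overlap, the highest-numbered one wins. */
int region_lookup(const region_t regions[NUM_REGIONS], uint32_t addr)
{
    for (int i = NUM_REGIONS - 1; i >= 0; --i) {
        const region_t *r = &regions[i];

        assert(!r->enabled || r->size != 0u);
        /* Unsigned subtraction wraps when addr < base, so the single
         * comparison also rejects addresses below the region. */
        if (r->enabled && (addr - r->base) < r->size) {
            return i;
        }
    }
    return -1;
}
```

The attributes of the returned region would then decide whether the access is cached, buffered, or permitted in User mode.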
Caches
Two caches are implemented, one
for instructions, the other for data,
both with an eight-word line size.
Each cache is constructed from
SRAM. The caches connect to the
ARM9E-S processor core through
32-bit buses, to allow one instruction
to be passed into the instruction
prefetch unit every cycle, and to
allow load and store multiple
instructions to transfer one register
every cycle.
Cache lock-down
Cache lock-down is provided to allow
critical code sequences to be locked
into the cache to ensure predictability
for real-time code. The cache
replacement algorithm can be
selected by the operating system as
either pseudo random or round-robin.
Both caches are four-way set-associative.
Lock down operates on a per-set basis.
[Figure: ARM946E-S block diagram showing the ARM9E-S core, instruction and data SRAM, instruction and data caches with their cache control, memory protection unit, system control coprocessor (CP15), external coprocessor interface, ETM interface, system controller, and AHB Bus Interface Unit with write buffer]
Cache features
ARM946E-S instruction and data
cache sizes can be selected
independently from the following
range of cache sizes:
0KB
4KB
8KB
16KB
32KB
64KB
128KB
256KB
512KB
1MB.
The caches are four-way set
associative, with a cache line length
of eight words. Cache entries are
allocated on a read miss basis.
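The relationship between cache size, associativity, and line length can be worked through in a few lines of C. The sketch below assumes 32-bit words (so an eight-word line is 32 bytes) and a conventional tag/index/offset split of a 32-bit address; the type and function names are hypothetical and the split is a textbook model, not a statement about the actual tag RAM layout.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_WAYS  4u   /* four-way set associative */
#define LINE_BYTES 32u   /* eight 32-bit words per line */

typedef struct {
    uint32_t sets;         /* number of sets */
    uint32_t offset_bits;  /* address bits selecting a byte in a line */
    uint32_t index_bits;   /* address bits selecting a set */
    uint32_t tag_bits;     /* remaining upper address bits */
} cache_geometry_t;

static uint32_t log2_u32(uint32_t x)
{
    uint32_t n = 0u;
    while (x > 1u) {
        x >>= 1;
        ++n;
    }
    return n;
}

/* Derives the geometry for one of the non-zero sizes listed above
 * (4KB to 1MB, powers of two). */
cache_geometry_t cache_geometry(uint32_t size_bytes)
{
    cache_geometry_t g;

    assert(size_bytes >= 4096u && size_bytes <= (1u << 20));
    assert((size_bytes & (size_bytes - 1u)) == 0u);  /* power of two */

    g.sets        = size_bytes / (CACHE_WAYS * LINE_BYTES);
    g.offset_bits = log2_u32(LINE_BYTES);
    g.index_bits  = log2_u32(g.sets);
    g.tag_bits    = 32u - g.index_bits - g.offset_bits;
    return g;
}
```

For example, a 4KB cache works out to 32 sets of four lines each, and a 1MB cache to 8192 sets.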
Write buffer
ARM946E-S™ also incorporates a
16-entry write buffer, to avoid stalling
the processor when writes to external
memory are performed.
Tightly Coupled Memory
ARM946E-S supports SRAM for the
Tightly Coupled Memory (TCM).
The minimum size, when TCM is
present, is 4KB, increasing in
powers of 2 (e.g. 8KB, 16KB) up to
1MB. The instruction and data SRAM
can therefore each have a different
size, from 0KB (absent) to 1MB.
The memory is capable of returning
data to the ARM9E-S core in a single
cycle.
[Figure: Protection unit block diagram (address from ARM9E-S core, address comparators, hits, priority encoder, attribute registers, abort, attributes)]
The ARM v5TE Architecture
Registers
The ARM9E-S™ processor core
consists of a 32-bit datapath and
associated control logic. That
datapath contains 31 general-purpose
registers, coupled to a full
shifter, Arithmetic Logic Unit, and
multiplier. At any one time 16
registers are visible to the user. The
remainder are banked registers,
swapped in to speed up exception
processing.
Register 15 is the Program Counter
(PC) and can be used in all
instructions to reference data relative
to the current instruction. R14 holds
the return address after a subroutine
call. R13 is used (by software
convention) as a stack pointer.
Modes and exception
handling
All exceptions have banked registers
for R14 and R13. After an exception,
R14 holds the return address for
exception processing. This address
is used both to return after the
exception is processed and to
address the instruction that caused
the exception. R13 is banked across
exception modes to provide each
exception handler with a private
stack pointer. The fast interrupt mode
also banks registers R8 to R12 so
that interrupt processing can begin
without the need to save or restore
these registers. A seventh
processing mode, System mode,
does not have any banked registers.
It uses the User mode registers.
System mode runs tasks that require
a privileged processor mode and
allows them to invoke all classes of
exceptions.
Status registers
All other processor states are held in
status registers. The current
operating processor status is in the
Current Program Status Register
(CPSR). The CPSR holds:
four ALU flags (Negative, Zero, Carry, and Overflow),
two interrupt disable bits (one for each type of interrupt),
a bit to indicate ARM or Thumb execution,
and five bits to encode the current processor mode.
All five exception modes also have a
Saved Program Status Register
(SPSR) which holds the CPSR of the
task immediately before the
exception occurred.
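The CPSR fields listed above can be unpacked with simple shifts and masks. The sketch below uses the bit positions and mode encodings defined by the ARM architecture (flags in bits 31 to 28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4 to 0); these positions are not given in this overview. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool negative, zero, carry, overflow;  /* four ALU flags */
    bool irq_disabled, fiq_disabled;       /* two interrupt disable bits */
    bool thumb;                            /* set: Thumb, clear: ARM */
    uint8_t mode;                          /* five-bit mode field */
} cpsr_fields_t;

cpsr_fields_t cpsr_decode(uint32_t cpsr)
{
    cpsr_fields_t f;

    f.negative     = ((cpsr >> 31) & 1u) != 0u;
    f.zero         = ((cpsr >> 30) & 1u) != 0u;
    f.carry        = ((cpsr >> 29) & 1u) != 0u;
    f.overflow     = ((cpsr >> 28) & 1u) != 0u;
    f.irq_disabled = ((cpsr >> 7) & 1u) != 0u;
    f.fiq_disabled = ((cpsr >> 6) & 1u) != 0u;
    f.thumb        = ((cpsr >> 5) & 1u) != 0u;
    f.mode         = (uint8_t)(cpsr & 0x1Fu);
    return f;
}

const char *cpsr_mode_name(uint8_t mode)
{
    switch (mode) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return "reserved";
    }
}

/* Usage sketch: Z and C set, both interrupts disabled, ARM state,
 * Supervisor mode. */
void cpsr_decode_demo(void)
{
    cpsr_fields_t f = cpsr_decode(0x600000D3u);

    assert(!f.negative && f.zero && f.carry && !f.overflow);
    assert(f.irq_disabled && f.fiq_disabled && !f.thumb);
    assert(f.mode == 0x13);
}
```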
Exception types
ARM9E-S supports five types of
exception, and a privileged
processing mode for each type. The
types of exceptions are:
fast interrupt (FIQ)
normal interrupt (IRQ)
memory aborts (used to implement memory protection or virtual memory)
attempted execution of an undefined instruction
software interrupts (SWIs).
Conditional execution
All ARM instructions (with the
exception of BLX) are conditionally
executed. Instructions optionally
update the four condition code flags
(Negative, Zero, Carry, and
Overflow) according to their result.
Subsequent instructions are
conditionally executed according to
the status of flags. Fifteen conditions
are implemented.
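How the fifteen conditions depend on the four flags can be sketched as a small C predicate. The condition names and their order follow the ARM architecture's condition field encoding (values 0 to 14), which this overview does not list; the function and type names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The fifteen conditions, in encoding order 0..14. */
typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS,
    COND_VC, COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE,
    COND_AL
} cond_t;

/* Returns true when an instruction with this condition would execute,
 * given the Negative, Zero, Carry, and Overflow flags. */
bool cond_passes(cond_t cond, bool n, bool z, bool c, bool v)
{
    switch (cond) {
    case COND_EQ: return z;               /* equal */
    case COND_NE: return !z;              /* not equal */
    case COND_CS: return c;               /* carry set, unsigned >= */
    case COND_CC: return !c;              /* carry clear, unsigned < */
    case COND_MI: return n;               /* negative */
    case COND_PL: return !n;              /* positive or zero */
    case COND_VS: return v;               /* overflow */
    case COND_VC: return !v;              /* no overflow */
    case COND_HI: return c && !z;         /* unsigned > */
    case COND_LS: return !c || z;         /* unsigned <= */
    case COND_GE: return n == v;          /* signed >= */
    case COND_LT: return n != v;          /* signed < */
    case COND_GT: return !z && (n == v);  /* signed > */
    case COND_LE: return z || (n != v);   /* signed <= */
    case COND_AL: return true;            /* always */
    default:
        assert(0 && "invalid condition");
        return false;
    }
}
```

For instance, after comparing 3 with 5 the flags are N=1, Z=0, C=0, V=0, so an LT-conditioned instruction executes and a GE-conditioned one is skipped.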
Four classes of
instructions
The ARM and Thumb instruction sets
can be divided into four broad
classes of instruction:
data processing instructions
load and store instructions
branch instructions
coprocessor instructions.
Data processing
The data processing instructions
operate on data held in general
purpose registers. Of the two source
operands, one is always a register.
The other has two basic forms:
an immediate value
a register value, optionally shifted.
If the operand is a shifted register, the
shift amount can be an immediate value
or the value of another register. Four
types of shift can be specified. Most data
processing instructions can perform
a shift followed by a logical or
arithmetic operation. Multiply
instructions come in two classes:
normal - 32-bit result
long - 64-bit result variants.
Both types of multiply instruction can
optionally perform an accumulate
operation.
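The shifted-operand and multiply behavior can be modelled in portable C. The four shift types named here (logical left, logical right, arithmetic right, rotate right) are the ARM architecture's; this overview does not name them. The model only covers shift amounts from 0 to 31, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR, SHIFT_ROR } shift_t;

/* Applies one of the four shift types to the second operand.
 * Simplified model: amounts 0..31 only. */
uint32_t shift_operand(uint32_t value, shift_t type, unsigned amount)
{
    assert(amount < 32u);
    if (amount == 0u) {
        return value;
    }
    switch (type) {
    case SHIFT_LSL:
        return value << amount;
    case SHIFT_LSR:
        return value >> amount;
    case SHIFT_ASR: {
        /* Replicate the sign bit into the vacated upper bits. */
        uint32_t fill = (value & 0x80000000u) ? ~(0xFFFFFFFFu >> amount) : 0u;
        return (value >> amount) | fill;
    }
    case SHIFT_ROR:
        return (value >> amount) | (value << (32u - amount));
    }
    return value;
}

/* A shift followed by an arithmetic operation, in one step. */
uint32_t add_shifted(uint32_t rn, uint32_t rm, shift_t type, unsigned amount)
{
    return rn + shift_operand(rm, type, amount);
}

/* Normal multiply keeps the low 32 bits of the product. */
uint32_t mul_normal(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Long multiply (unsigned variant) keeps the full 64-bit product. */
uint64_t mul_long_unsigned(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

With these helpers, `add_shifted(base, index, SHIFT_LSL, 2)` computes the address of a 32-bit array element in a single operation, which is the typical use of the shifted second operand.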
Load and store
The second class of instruction is
load and store instructions. These
instructions come in two main types:
load or store the value of a single register or register pair
load or store multiple register values.
Load and store single register
instructions can transfer a 32-bit
word, a 16-bit halfword and an
eight-bit byte between memory and a
register. Byte and halfword loads
may be automatically zero extended
or sign extended as they are loaded.
A preload `hint' instruction is
available to help minimize memory
system latency. Swap instructions
perform an atomic load and store as
a synchronization primitive.
Addressing modes
Load and store instructions have
three primary addressing modes:
offset
pre-indexed
post-indexed.
They are formed by adding or
subtracting an immediate or register-
based offset to or from a base
register. Register-based offsets can
also be scaled with shift operations.
Pre-indexed and post-indexed
addressing modes update the base
register with the base plus offset
calculation. As the PC is a general
purpose register, a 32-bit value can
be loaded directly into the PC to
perform a jump to any address in the
4GB memory space.
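The difference between the three addressing modes lies in which address is used for the access and whether the base register is written back. A minimal C model, with hypothetical names, where a negative offset stands for the subtracting form:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ADDR_OFFSET, ADDR_PRE_INDEXED, ADDR_POST_INDEXED } addr_mode_t;

/* Returns the address used for the memory access and updates *base
 * for the modes that write the base register back. */
uint32_t effective_address(uint32_t *base, int32_t offset, addr_mode_t mode)
{
    uint32_t sum;

    assert(base != NULL);
    sum = *base + (uint32_t)offset;   /* base plus (or minus) offset */

    switch (mode) {
    case ADDR_OFFSET:                 /* access base+offset, base unchanged */
        return sum;
    case ADDR_PRE_INDEXED:            /* access base+offset, then write back */
        *base = sum;
        return sum;
    case ADDR_POST_INDEXED: {         /* access base, then write back */
        uint32_t access = *base;
        *base = sum;
        return access;
    }
    }
    return sum;
}
```

Post-indexed addressing is the natural fit for walking through a buffer: each access uses the current pointer and leaves it advanced for the next one.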
Block transfers
Load and store multiple instructions
perform a block transfer of any
number of the general purpose
registers to or from memory. Four
addressing modes are provided:
pre-increment addressing
post-increment addressing
pre-decrement addressing
post-decrement addressing.
The base address is specified by a
register value (which can be
optionally updated after the transfer).
As the subroutine return address and
the PC values are in general purpose
registers, very efficient subroutine
calls can be constructed.
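The four block-transfer addressing modes differ in where the transferred block sits relative to the base register. The sketch below computes the lowest address accessed and the written-back base for a given register list, following the ARM architecture's rule that registers occupy consecutive words. The names are hypothetical, and the register list is modelled as a 16-bit mask with bit n standing for Rn.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    BLK_POST_INC,  /* post-increment */
    BLK_PRE_INC,   /* pre-increment  */
    BLK_POST_DEC,  /* post-decrement */
    BLK_PRE_DEC    /* pre-decrement  */
} block_mode_t;

typedef struct {
    uint32_t first_address;  /* lowest address accessed */
    uint32_t updated_base;   /* base value if write-back is requested */
} block_plan_t;

static unsigned count_registers(uint16_t reg_list)
{
    unsigned n = 0u;
    while (reg_list != 0u) {
        n += reg_list & 1u;
        reg_list >>= 1;
    }
    return n;
}

block_plan_t block_transfer_plan(uint32_t base, uint16_t reg_list,
                                 block_mode_t mode)
{
    block_plan_t p = { base, base };
    uint32_t bytes;

    assert(reg_list != 0u);                  /* at least one register */
    bytes = 4u * count_registers(reg_list);  /* one word per register */

    switch (mode) {
    case BLK_POST_INC:
        p.first_address = base;
        p.updated_base  = base + bytes;
        break;
    case BLK_PRE_INC:
        p.first_address = base + 4u;
        p.updated_base  = base + bytes;
        break;
    case BLK_POST_DEC:
        p.first_address = base - bytes + 4u;
        p.updated_base  = base - bytes;
        break;
    case BLK_PRE_DEC:
        p.first_address = base - bytes;
        p.updated_base  = base - bytes;
        break;
    }
    return p;
}
```

Saving R4 to R6 and R14 on subroutine entry with pre-decrement, then restoring them with post-increment and loading the saved R14 value into the PC, is one way the efficient call and return sequences mentioned above can be built.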
Branch
As well as allowing any data
processing or load instruction to
change control flow (by writing the
PC) a standard branch instruction is
provided with 24-bit signed offset,
allowing forward and backward
branches of up to 32MB.
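The 32MB figure follows from the encoding: the 24-bit signed offset counts words, so it is multiplied by four to give a byte offset. The C sketch below decodes an ARM-state branch; the detail that the offset is applied relative to the branch address plus 8 comes from the ARM architecture, not from this overview, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extracts the signed byte offset from the low 24 bits of a branch
 * instruction word. */
int32_t branch_offset_bytes(uint32_t instr)
{
    uint32_t imm24 = instr & 0x00FFFFFFu;
    /* Sign-extend from 24 bits, then convert words to bytes. */
    int32_t words = (int32_t)(imm24 ^ 0x00800000u) - 0x00800000;

    return words * 4;
}

/* Computes the branch destination for a branch located at instr_addr. */
uint32_t branch_target(uint32_t instr_addr, uint32_t instr)
{
    return instr_addr + 8u + (uint32_t)branch_offset_bytes(instr);
}

/* Usage sketch: the reachable range is -32MB to +32MB-4 bytes. */
void branch_range_demo(void)
{
    assert(branch_offset_bytes(0xEA7FFFFFu) == 32 * 1024 * 1024 - 4);
    assert(branch_offset_bytes(0xEA800000u) == -32 * 1024 * 1024);
    /* An offset of -2 words branches back to the instruction itself. */
    assert(branch_target(0x8000u, 0xEAFFFFFEu) == 0x8000u);
}
```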
Branch with Link
There is a Branch with Link (BL)
which allows efficient subroutine
calls. BL preserves the address of
the instruction after the branch in
R14 (the Link Register or LR). This
allows a move instruction to put the
LR into the PC and return to the
instruction after the branch.
The third type of branch (BX and
BLX) switches between the ARM and
Thumb instruction sets. BLX also
preserves the return address in the
LR, as BL does.
Coprocessor
There are three types of coprocessor
instructions:
coprocessor data processing instructions are used to invoke a coprocessor-specific internal operation.
coprocessor register transfer instructions allow a coprocessor value (word or double word) to be transferred to or from an ARM register (or register pair).
coprocessor data transfer instructions transfer coprocessor data to or from memory, where the ARM calculates the address of the transfer.