Detailed explanation of the ARM MMU (virtual addresses)

1. The origin of the MMU

Many years ago, when people were still using DOS or even older operating systems, computer memory was very small and was generally measured in kilobytes. Programs of that era were correspondingly small, so even that limited memory could hold them. With the rise of graphical interfaces and ever-growing user demands, however, applications grew as well, and programmers eventually faced a problem: the program was too large to fit in memory. The usual solution was to divide the program into fragments called overlays. Overlay 0 ran first and, when it finished, called the next overlay. Although the OS swapped the overlays in and out, the programmer first had to split the program by hand, which was time-consuming, laborious, and tedious. A more fundamental solution was needed, and it soon arrived: virtual memory. The basic idea of virtual memory is that the combined size of a program's code, data, and stack may exceed the size of physical memory. The operating system keeps the parts currently in use in memory and stores the rest on disk. For example, given a 16MB program and a machine with only 4MB of memory, the OS decides at each moment which 4MB to keep in memory and swaps program fragments between memory and disk as needed. The 16MB program can thus run on a machine with only 4MB of memory, without the programmer having to split it up beforehand.

On any computer there is a set of addresses that a program can generate, which we call the address range. The size of this range is determined by the number of address bits of the CPU. For example, for a 32-bit CPU the address range is 0~0xFFFFFFFF (4G), and for a 64-bit CPU it is 0~0xFFFFFFFFFFFFFFFF (16EB). This is the range of addresses our program can generate. We call it the virtual address space, and we call an address within it a virtual address. Corresponding to the virtual address space and virtual addresses are the physical address space and physical addresses. Most of the time, the physical address space of a system is only a subset of the virtual address space. Here is a simple example to illustrate both: for a 32-bit x86 host with 256MB of memory, the virtual address space is 0~0xFFFFFFFF (4G), and the physical address space is 0x00000000~0x0FFFFFFF (256MB).

On a machine that does not use virtual memory, the virtual address is sent directly to the memory bus, and the physical memory location with the same address is read or written. When virtual memory is used, the virtual address is not sent directly to the memory address bus but to the memory management unit, or MMU (the protagonist finally appears). It consists of one chip or a group of chips, generally exists as part of a coprocessor, and its function is to map virtual addresses to physical addresses.

 

2. MMU working process

Most systems that use virtual memory use a technique called paging. The virtual address space is divided into units called pages, and the physical address space is correspondingly divided into page frames. Pages and page frames must be the same size. Next, with reference to the figure, I use an example to illustrate how pages are mapped to page frames by the MMU:



In this example, we have a machine that generates 16-bit addresses. Its virtual addresses range from 0x0000 to 0xFFFF (64K), but the machine has only 32K of physical memory. It can therefore run a 64K program, but the program cannot be loaded into memory all at once. The machine must have external storage (such as a disk or flash) large enough to hold the 64K program, so that program fragments can be brought in when needed. In this example the page size is 4K, and the page frame size is the same (this must be guaranteed, because transfers between memory and external storage are always in units of pages). The 64K virtual address space and the 32K physical memory thus contain 16 pages and 8 page frames respectively.

Let us first explain a few terms used with paging, with reference to the figure above. We have already met pages and page frames. The green part of the figure is the physical space, and each cell represents a physical page frame. The orange part is the virtual space, and each cell represents a page. Each page entry consists of two parts, the Frame Index (page frame index) and bit p (present bit). The Frame Index indicates which physical page frame the page is mapped to, and bit p indicates whether the mapping of this page is valid. As shown in the figure, when a page is not mapped (or its mapping is invalid, shown by an X in the Frame Index part), this bit is 0; when the mapping is valid, the bit is 1.

We execute the following instructions (the instructions in this example are not for any specific CPU; they are pseudo-instructions).

Example 1:

MOVE REG, 0 //Load the value at address 0 into the register REG.

Virtual address 0 is sent to the MMU. The MMU sees that the address falls in page 0 (page 0 covers 0 to 4095). From the figure above, page 0 is mapped to page frame 2 (page frame 2 covers 8192 to 12287), so the MMU converts the virtual address into physical address 8192 and puts 8192 on the address bus. The memory knows nothing about the MMU mapping; it simply sees a read request for address 8192 and executes it. The MMU thus maps virtual addresses 0 to 4095 onto physical addresses 8192 to 12287.

Example 2:

MOVE REG, 8192

is converted to

MOVE REG,24576

This is because virtual address 8192 lies in page 2, and page 2 is mapped to page frame 6 (physical addresses 24576 to 28671).

Example 3:

MOVE REG, 20500

is converted to

MOVE REG,12308

Virtual address 20500 is 20 bytes past the start of virtual page 5 (virtual addresses 20480 to 24575). Virtual page 5 is mapped to page frame 3 (page frame 3 covers 12288 to 16383), so the address is mapped to physical address 12288+20=12308.
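The three translations above can be reproduced with a small table lookup. This is only a sketch of the idea, not how any particular MMU is built: the names `page_table`, `NOT_PRESENT`, and `translate` are made up for illustration, and only the mappings stated in the text (page 0 to frame 2, page 1 to frame 1, page 2 to frame 6, page 5 to frame 3) are filled in; every other page is simply treated as unmapped here.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u   /* 4K pages, as in the example */
#define NUM_PAGES   16u     /* 64K virtual space / 4K per page */
#define NOT_PRESENT (-1)    /* hypothetical marker for an unmapped page */

/* Page number -> page frame number. Only the mappings named in the text
   are filled in; all other entries are assumed unmapped in this sketch. */
int page_table[NUM_PAGES] = {
    2, 1, 6, NOT_PRESENT, NOT_PRESENT, 3, NOT_PRESENT, NOT_PRESENT,
    NOT_PRESENT, NOT_PRESENT, NOT_PRESENT, NOT_PRESENT,
    NOT_PRESENT, NOT_PRESENT, NOT_PRESENT, NOT_PRESENT
};

/* Returns the physical address, or -1 if the page is not mapped. */
long translate(uint16_t vaddr)
{
    unsigned page   = vaddr / PAGE_SIZE;   /* which page the address is in */
    unsigned offset = vaddr % PAGE_SIZE;   /* distance from the page start */
    int frame = page_table[page];

    if (frame == NOT_PRESENT)
        return -1;                         /* would be a page fault */
    return (long)frame * PAGE_SIZE + offset;
}
```

With this table, `translate(0)` gives 8192, `translate(8192)` gives 24576, and `translate(20500)` gives 12308, matching Examples 1 to 3.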

By setting up the MMU appropriately, each of the 16 virtual pages can be mapped to any of the 8 page frames, but this alone does not solve the problem that the virtual address space is larger than the physical address space. As the figure above shows, we have only 8 page frames (physical) but 16 pages (virtual), so only 8 of the 16 pages can be validly mapped at any one time. Let's see what happens in Example 4:

MOVE REG, 32780

Virtual address 32780 falls within page 8. From the figure above, page 8 is not validly mapped (the page is marked with an X). What happens? The MMU notices that the page is not mapped and notifies the CPU that a page fault has occurred. The operating system must then handle the page fault: it picks one of the 8 physical page frames that is currently little used and writes its contents out to external storage (this action is called page copy), then maps the page that needs to be referenced (page 8 in Example 4) to the page frame just freed (this action is called modifying the mapping), and finally re-executes the faulting instruction (MOVE REG, 32780). Suppose the operating system decides to free page frame 1. It then loads virtual page 8 into physical addresses 4K to 8K and makes two changes. First, it marks virtual page 1 as unmapped (virtual page 1 was originally mapped to page frame 1), so that any later access to virtual addresses 4K to 8K causes a page fault and makes the operating system take the appropriate action (the action we are discussing now). Second, it changes the page frame number of virtual page 8 from X to 1, so that when MOVE REG, 32780 is executed again, the MMU maps 32780 to 4108.
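The two table changes described above can be sketched as follows. This is a simplified model under stated assumptions: `struct pte`, `table`, `handle_page_fault`, and `translate` are hypothetical names, the victim frame is passed in rather than chosen by a replacement policy, and copying page contents to and from external storage is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 16u

struct pte {
    unsigned frame;    /* Frame Index: which page frame the page maps to */
    bool     present;  /* bit p: true if the mapping is valid */
};

/* Only the mappings named in the text are present; the rest start unmapped. */
struct pte table[NUM_PAGES] = {
    [0] = {2, true}, [1] = {1, true}, [2] = {6, true}, [5] = {3, true},
};

/* Evict whatever page owns victim_frame, then map faulting_page there.
   The disk transfers the operating system would perform are omitted. */
void handle_page_fault(unsigned faulting_page, unsigned victim_frame)
{
    for (unsigned p = 0; p < NUM_PAGES; p++) {
        if (table[p].present && table[p].frame == victim_frame)
            table[p].present = false;   /* later accesses to p will fault */
    }
    table[faulting_page].frame   = victim_frame;
    table[faulting_page].present = true;
}

/* Returns the physical address, or -1 to signal a page fault. */
long translate(uint16_t vaddr)
{
    unsigned page = vaddr / PAGE_SIZE;

    if (!table[page].present)
        return -1;
    return (long)table[page].frame * PAGE_SIZE + vaddr % PAGE_SIZE;
}
```

Before the fault is handled, `translate(32780)` reports a fault; after `handle_page_fault(8, 1)` it returns 4108, while addresses in virtual page 1 (4K to 8K) now fault instead.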

We now have a rough idea of the role the MMU plays in the machine and of what its basic job is. Next we use an example to show how it works (note that the MMU in this example is not a specific model; it is an abstraction of how all MMUs work).

First of all, be clear that the MMU has only one main job: mapping virtual addresses to physical addresses.

We already know that most systems that use virtual memory use a technique called paging. As in the example just given, the virtual address space is divided into a set of pages of equal size, and each page has a page number that identifies it (the page number is generally its index in the set, much like an array index in C/C++). In the example above, the page covering 0~4K is page 0, the page covering 4~8K is page 1, the page covering 8~12K is page 2, and so on. A virtual address (note: a single address, not a space) is divided by the MMU into 2 parts: the first is the page index, and the second is the offset relative to the start address of the page. Let's take the 16-bit machine from before and the following figure as an example. In this example, the virtual address 8196 is sent to the MMU, which maps it to a physical address. The total address range a 16-bit CPU can generate is 0~64K. At 4K per page, the space is divided into 16 pages, so the page index part of the virtual address must be able to express 16 values (so that every page can be indexed), which means it needs at least 4 bits. A page is 4K (4096) in size, so the offset part must be 12 bits (2^12=4096, so that every address within a page can be reached). The binary code of 8196 is shown in the figure below:



The page index of this address is 0010 (binary), so the indexed page is page 2, and the second part is 000000000100 (binary), so the offset is 4. The page frame number of page 2 is 6 (page 2 is mapped to page frame 6, see the figure above), and page frame 6 covers physical addresses 24~28K. The MMU therefore maps virtual address 8196 to physical address 24580 (start address of the page frame + offset = 24576 + 4 = 24580). Similarly, if we read virtual address 1026, whose binary code is 0000010000000010, then page index=0000=0 and offset=010000000010=1026. The page number is 0, page 0 is mapped to page frame 2, and page frame 2 covers physical addresses 8192~12287, so the MMU maps virtual address 1026 to physical address 9218 (start address of the page frame + offset = 8192 + 1026 = 9218). This is the working process of the MMU.
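The split into a 4-bit page index and a 12-bit offset is just a shift and a mask. The following sketch shows it for the 16-bit example machine; the function names are made up for illustration, and the page frame number is passed in by the caller rather than looked up in a table.

```c
#include <stdint.h>

#define OFFSET_BITS 12u                            /* 2^12 = 4096 bytes per page */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)     /* low 12 bits */

/* Upper 4 bits of the 16-bit virtual address: the page index. */
unsigned page_index(uint16_t vaddr)
{
    return (unsigned)vaddr >> OFFSET_BITS;
}

/* Lower 12 bits: the offset from the start of the page. */
unsigned page_offset(uint16_t vaddr)
{
    return vaddr & OFFSET_MASK;
}

/* Physical address = start address of the page frame + offset. */
unsigned physical_address(unsigned frame, uint16_t vaddr)
{
    return (frame << OFFSET_BITS) + page_offset(vaddr);
}
```

For virtual address 8196 this yields page index 2 and offset 4, and `physical_address(6, 8196)` is 24580; for 1026 it yields page index 0 and offset 1026, and `physical_address(2, 1026)` is 9218.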

3. S3C24XX MMU working process

Below we look at the MMU (Note 1) of the S3C2410.

The S3C2410 has 4 memory mapping methods in total:

  1. Fault (no mapping)

  2. Coarse Page (coarse page table)

  3. Section

  4. Fine Page

We use Section mode for the explanation.

The ARM920T is a 32-bit CPU, and its virtual address space is 2^32=4G. In Section mode, the 4G virtual space is divided into units called Sections (essentially the same as the pages discussed above), and each section is 1M long (the pages we used earlier were 4K long). The 4G virtual memory is therefore divided into 4096 sections in total (1M*4096=4G), so 4096 descriptors are needed to describe them. Each descriptor occupies 4 bytes, so the set of descriptors is 16KB in size (4 bytes*4096). These 4096 descriptors form a table, which we call the Translation Table.
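The sizes above can be written down directly in C. This is only an illustration of the arithmetic; the macro and array names are hypothetical, and a real table must also be placed in memory according to the alignment the hardware requires, which is not shown here.

```c
#include <stdint.h>

#define VIRTUAL_SPACE (1ULL << 32)                    /* 4G virtual address space */
#define SECTION_SIZE  (1ULL << 20)                    /* 1M per section */
#define NUM_SECTIONS  (VIRTUAL_SPACE / SECTION_SIZE)  /* 4096 sections */

/* One 4-byte descriptor per section: 4096 * 4 bytes = 16KB in total. */
uint32_t translation_table[NUM_SECTIONS];
```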



    The figure above shows the structure of the descriptor:
    Section base address: base address of the section (equivalent to the start address of the page frame)
    AP: Access Permission
    Domain: index into the domain access control register. Domain is used together with AP to check access permissions
    C: cacheable bit
    B: bufferable bit. On the ARM920T, C=1 with B=0 selects write-through (WT) mode, and C=1 with B=1 selects write-back (WB) mode
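As a rough sketch, the fields listed above can be packed into a 32-bit section descriptor as shown below. The bit positions are my reading of the ARM920T first-level section descriptor layout (base in bits 31:20, AP in 11:10, Domain in 8:5, bit 4 set, C in bit 3, B in bit 2, type 10 in bits 1:0) and should be checked against the descriptor figure or the ARM920T manual. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define SECTION_TYPE 0x2u        /* bits [1:0] = 10: section descriptor (assumed layout) */
#define SECTION_BIT4 (1u << 4)   /* bit 4 is written as 1 (assumed layout) */

/* Pack the descriptor fields into one 32-bit word. */
uint32_t make_section_descriptor(uint32_t phys_base, unsigned ap,
                                 unsigned domain, unsigned c, unsigned b)
{
    return (phys_base & 0xFFF00000u)        /* Section base address, bits [31:20] */
         | ((uint32_t)(ap & 0x3u) << 10)    /* AP, bits [11:10] */
         | ((uint32_t)(domain & 0xFu) << 5) /* Domain, bits [8:5] */
         | SECTION_BIT4
         | ((uint32_t)(c & 0x1u) << 3)      /* C, bit 3 */
         | ((uint32_t)(b & 0x1u) << 2)      /* B, bit 2 */
         | SECTION_TYPE;
}
```

For example, a section at physical address 0x30100000 with AP=10, Domain=4, and C and B both clear is encoded as 0x30100892 under this layout.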
    The following is a schematic diagram of the S3C2410's memory after mapping:



The SDRAM fitted to my S3C2410 board is 64M in size. Its physical address range is 0x3000 0000~0x33FF FFFF (Bank 6). Since one Section is 1M, this physical space can be divided into 64 physical sections (page frames).

In Section mode, the virtual address (Note 1) sent to the MMU is divided into two parts, just as in the example above: the Descriptor Index (equivalent to the Page Index in the example above) and the Offset. The descriptor index is 12 bits long (2^12=4096, which is exactly the number of descriptors), and the offset is 20 bits long (2^20=1M, which is exactly the size of a section). Now look at the Section Base Address part of a descriptor. It is 12 bits long, and its value is the top 12 bits of the physical address of the physical section (page frame) to which the virtual section (page) is mapped. Since a section is 1M long, the low 20 bits of the start address of a physical section are always 0x00000 (every Section is aligned to 1M). A physical address is therefore determined as physical section base address + offset part of the virtual address = (Section Base Address<<20)+Offset. This may be a little confusing, so let's illustrate it with a practical example.

Suppose the instruction MOV REG, 0x30000012 is now executed. The binary code of the virtual address is 00110000 00000000 00000000 00010010. The first 12 bits give Descriptor Index=00110000 0000=768, so descriptor No. 768 is looked up in the Translation Table. The Section Base Address of this descriptor is 0x300, which means that the physical section (page frame) mapped by the virtual section (page) this descriptor describes starts at 0x3000 0000 (base address of the physical section = Section Base Address shifted left by 20 bits = 0x300<<20 = 0x3000 0000). Offset=0000 00000000 00010010=0x12, so the virtual address 0x30000012 is mapped to physical address 0x3000 0000+0x12=0x3000 0012 (physical section base address + offset from the virtual address). You may ask why the virtual address is the same as the physical address it maps to. This is determined by the mapping rule we define, and in this example the rule is to map each virtual address to the physical address equal to it. We write the code for this mapping like this:

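void mem_mapping_linear(void)
{
    /* mmu_tlb_base (pointer to descriptor No. 0 of the Translation Table) and
       MMU_OTHER_SECDESC (the remaining descriptor bits) are defined elsewhere. */
    unsigned long descriptor_index, section_base, sdram_base, sdram_size;

    sdram_base = 0x30000000;   /* start of SDRAM */
    sdram_size = 0x4000000;    /* 64M */

    /* One descriptor per 1M section */
    for (section_base = sdram_base, descriptor_index = section_base >> 20;
         section_base < sdram_base + sdram_size;
         descriptor_index += 1, section_base += 0x100000)
    {
        /* The Section Base Address occupies the top 12 bits of the descriptor */
        *(mmu_tlb_base + descriptor_index) = ((section_base >> 20) << 20) | MMU_OTHER_SECDESC;
    }
}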

The code above maps the virtual space 0x3000 0000~0x33FF FFFF to the physical space 0x3000 0000~0x33FF FFFF. Since the virtual space coincides with the physical space, each virtual address has the same value as its physical address. After initializing the Translation Table, remember to load the start address of the Translation Table (the address of descriptor No. 0) into Control Register 2 of the coprocessor CP15. This control register is called the Translation Table Base (TTB) register.
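To make the lookup concrete, the translation the MMU performs in Section mode can be modelled in ordinary C. This is a software model for illustration only, not code that runs on the MMU: `section_translate` is a hypothetical name, the table is passed in as a plain array of 4096 descriptors, and fault and permission handling are ignored.

```c
#include <stdint.h>

/* Model of a Section-mode lookup. table points to 4096 section descriptors. */
uint32_t section_translate(const uint32_t *table, uint32_t vaddr)
{
    uint32_t index  = vaddr >> 20;                 /* top 12 bits: Descriptor Index */
    uint32_t offset = vaddr & 0x000FFFFFu;         /* low 20 bits: Offset */
    uint32_t base   = table[index] & 0xFFF00000u;  /* Section Base Address << 20 */

    return base + offset;
}
```

With descriptor No. 768 (0x300) holding base 0x3000 0000, `section_translate` maps 0x30000012 to 0x30000012, as in the worked example. Pointing the same descriptor at a different physical section would change the result without changing the virtual address.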

The discussion above covered the Section Base Address in the descriptor and the mapping between virtual and physical addresses. The MMU has another important function, however: the access control mechanism (Access Permission). Simply put, access control is how the CPU determines whether the current program's access to memory is legal (whether it has permission to access that memory). If the current program does not have permission for the memory area it tries to access, the CPU raises an exception. The S3C2410 calls this exception a Permission fault; the x86 architecture calls it General Protection. What can cause a Permission fault? For example, a User-level program writing to a System-level memory area is an unauthorized operation and should cause a Permission fault. Readers who have worked with the x86 architecture will have heard of Protected Mode, which is based on the same idea, so we can also say that the access control mechanism of the S3C2410 is in fact a protection mechanism. So which elements are involved in the access control mechanism of the S3C2410, and how do they work together? The elements are:

  1. Control Register 3 in the coprocessor CP15: the DOMAIN ACCESS CONTROL REGISTER

  2. The AP bits and Domain bits in the section descriptor

  3. The S bit and R bit in Control Register 1 in the coprocessor CP15

  4. Control Register 5 in the coprocessor CP15

  5. Control Register 6 in the coprocessor CP15

The DOMAIN ACCESS CONTROL REGISTER is the access control register. All 32 bits of the register are used. It is divided into 16 fields of two bits each, and each field gives the access permission check level for the corresponding domain, as shown in the following figure:



Each field can hold one of 4 values, 00, 01, 10, or 11 (binary), with the following meanings:

    00: The memory area may not be accessed; any access causes a domain fault
    01: Accesses to the memory area are checked against the AP bits in the section descriptor of that area
    10: Reserved (we do not use this value, to avoid unpredictable behavior)
    11: No permission check is performed on accesses to the memory area
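Since each of the 16 fields is two bits wide, the field for a given domain can be pulled out of the register value with a shift and a mask. The sketch below assumes field 0 sits in the lowest two bits; the enum and function names are hypothetical (ARM documentation calls the 01 and 11 settings "client" and "manager").

```c
#include <stdint.h>

enum domain_access {
    DOMAIN_NO_ACCESS = 0,   /* 00: any access causes a domain fault */
    DOMAIN_CLIENT    = 1,   /* 01: accesses are checked against the AP bits */
    DOMAIN_RESERVED  = 2,   /* 10: reserved, not used */
    DOMAIN_MANAGER   = 3    /* 11: no permission check */
};

/* Extract the 2-bit field for one of the 16 domains from the register value. */
unsigned domain_field(uint32_t dacr, unsigned domain)
{
    return (dacr >> (2u * (domain & 0xFu))) & 0x3u;
}
```

For the register value 0xFFFF BDCF, `domain_field` returns 3 (binary 11) for domain 0 and 1 (binary 01) for domain 4.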
Now let's look at the Domain field in the descriptor. The field is 4 bits wide, and its value is an index into the 16 fields of the DOMAIN ACCESS CONTROL REGISTER. The AP bits work together with the S bit and R bit to describe the access permissions of the memory area covered by the descriptor. Their relationship is shown in the following figure:




The AP field also has four possible values, which I will illustrate with examples.
In the following examples, our DOMAIN ACCESS CONTROL REGISTER is initialized to 0xFFFF BDCF, as shown in the following figure:



Example 1:

Domain=4, AP=10 in the descriptor (in this case, the S bit and R bit are ignored)

Suppose we now access the memory area described by this descriptor:

Since Domain=4, and the value of field 4 in the DOMAIN ACCESS CONTROL REGISTER is 01, the system checks the access permission for this access.

If the CPU is currently in Supervisor mode, the program can read and write the memory area described by the descriptor.

If the CPU is currently in User mode, the program can read the memory described by the descriptor, but a write causes a permission fault.

Example 2:

Domain=0, AP=10 in the descriptor (in this case, the S bit and R bit are ignored)

Domain=0, and the value of field 0 in the DOMAIN ACCESS CONTROL REGISTER is 11, so the system performs no permission check on accesses to this memory area.

Since no permission check is performed, the program can read and write the memory described by the descriptor whichever mode the CPU is in (Supervisor mode or User mode).

Example 3:

Domain=4, AP=11 in the descriptor (in this case, the S bit and R bit are ignored)

Since Domain=4, and the value of field 4 in the DOMAIN ACCESS CONTROL REGISTER is 01, the system checks the access permission for this access.

Since AP=11, the program can read and write the memory described by the descriptor whichever mode the CPU is in (Supervisor mode or User mode).

Example 4:

Domain=4, AP=00, S bit=0, R bit=0 in the descriptor

Since Domain=4, and the value of field 4 in the DOMAIN ACCESS CONTROL REGISTER is 01, the system checks the access permission for this access.

Because AP=00, S bit=0, and R bit=0, the memory described by this descriptor cannot be accessed in either mode (Supervisor mode or User mode); any read or write causes a permission fault. With AP=00, read-only access in both modes is obtained with S bit=0 and R bit=1, in which case only writes cause a permission fault.

From the 4 examples above, we draw two conclusions:

  1. Whether a permission check is required for an access to a memory area is determined by the Domain field in the descriptor of that area, which selects a field in the DOMAIN ACCESS CONTROL REGISTER.

  2. The access permission of a memory area is determined by the AP bits in the descriptor of that area together with the S bit and R bit in Control Register 1 of the coprocessor CP15.
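The two conclusions can be combined into a single check, sketched below as a software model. The names are hypothetical, and the AP/S/R decoding follows my understanding of the ARM920T access permission table (AP=11: read/write in any mode; AP=10: User read-only; AP=01: privileged only; AP=00: depends on S and R, with S=0 R=1 giving read-only in any mode and S=1 R=0 giving Supervisor read-only). Treating the reserved domain value 10 and the S=1 R=1 combination as "no access" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

enum access_result { ACCESS_OK, DOMAIN_FAULT, PERMISSION_FAULT };

enum access_result check_access(uint32_t dacr, unsigned domain, unsigned ap,
                                bool s, bool r, bool privileged, bool is_write)
{
    /* Conclusion 1: the Domain field selects a 2-bit field in the register. */
    unsigned field = (dacr >> (2u * (domain & 0xFu))) & 0x3u;
    bool can_read, can_write;

    if (field == 0x3u)
        return ACCESS_OK;        /* 11: no permission check */
    if (field != 0x1u)
        return DOMAIN_FAULT;     /* 00, and reserved 10 (assumed the same) */

    /* Conclusion 2: AP, together with S and R, decides the permission. */
    switch (ap & 0x3u) {
    case 0x3u: can_read = true;       can_write = true;       break;
    case 0x2u: can_read = true;       can_write = privileged; break;
    case 0x1u: can_read = privileged; can_write = privileged; break;
    default:   /* AP=00: at most read-only, depending on S and R */
        can_write = false;
        if (!s && r)      can_read = true;        /* read-only in any mode */
        else if (s && !r) can_read = privileged;  /* Supervisor read-only */
        else              can_read = false;       /* no access (S=R=1 assumed too) */
        break;
    }
    return (is_write ? can_write : can_read) ? ACCESS_OK : PERMISSION_FAULT;
}
```

With the register set to 0xFFFF BDCF, a User-mode write to a Domain=4, AP=10 section yields `PERMISSION_FAULT`, while the same write to a Domain=0 section yields `ACCESS_OK` because field 0 is 11.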
