Endianness article leads to some questions
When I write an article, I am generally talking about how to do something or how something works in embedded software. I try to look at all the angles and consider the starting point of all possible readers. I do my best, but it is inevitable that I will fail. That is OK. The result is that people write to me and ask questions or request that I fill in the blanks. I am always pleased to receive questions by email or via social media, so keep them coming.
Recently, I posted about the publication of an article on Endianness. I thought that I had covered all the angles, but it seems not …
I received a couple of questions from an embedded software engineer, who asked not to be identified in this posting. I am not sure why he wanted anonymity, but I would always respect such a request. [Actually, if you ask me a question, I will do my best to answer it directly and might ask your permission to use the discussion in a blog. At that time, I would ask you whether you wish to be identified or not.]
This guy essentially asked two questions, starting with this code:
unsigned int n = 0x0a0b0c0d;
unsigned char c;
c = (unsigned char) n;
Question:
Could you please elaborate a bit on how exactly n, which is 4 bytes, gets copied to c, which is only a single byte wide?
I dug on the net to find out that we’re truncating some data here. But how exactly that truncation happens in micro steps is not clear.
I mean, does the value of n get copied to a buffer and then truncated and then copied to c, or how?
Answer:
The exact steps depend somewhat on the compiler and the CPU [instruction set] in question. Most likely, n is loaded into a [32-bit] register, and the least significant byte of that register is then stored into c. This process is not sensitive to endianness at all: the conversion works on the value of n, not on its layout in memory, so c always receives the least significant byte, which is 0x0d here.
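To make the truncation concrete, here is a minimal sketch of the same conversion written two ways. The function names low_byte_by_cast and low_byte_by_modulo are my own, invented for illustration, and the sketch assumes the usual 8-bit char. The C rule for converting to an unsigned type reduces the value modulo one more than the largest value the new type can hold, so both functions should agree on any CPU, whatever its byte order.

```c
#include <limits.h>

/* Hypothetical helper names, for illustration only. */

/* The cast from the example above: keeps the low-order byte of n. */
unsigned char low_byte_by_cast(unsigned int n)
{
    return (unsigned char) n;
}

/* The same result written as arithmetic: n modulo (UCHAR_MAX + 1),
   which is n modulo 256 when char is 8 bits wide (assumed here). */
unsigned char low_byte_by_modulo(unsigned int n)
{
    return (unsigned char) (n % (UCHAR_MAX + 1u));
}
```

Calling either function with 0x0a0b0c0d gives 0x0d. Neither looks at how n is stored in memory, which is why endianness does not come into it.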
More code for the second question:
union e {
    unsigned int ui;
    unsigned char a[4];
} f;
f.ui = n;
printf("a[0] = 0x%02x\n", f.a[0]);
printf("a[1] = 0x%02x\n", f.a[1]);
printf("a[2] = 0x%02x\n", f.a[2]);
printf("a[3] = 0x%02x\n", f.a[3]);
Question:
I want to know how each memory block comes to be referred to as f.a[0] or f.a[1] etc. What I understand is that the union will have shared memory. After the statement f.ui = n; the data from n will be copied into the 4-byte space of the union (which is labelled f.ui) as a 4-byte data chunk (assuming the word size is 4 bytes). Now, when we print the char array elements, how does the compiler resolve that the first byte is f.a[0], f.a[1]…
Answer:
As you said, f.ui occupies 4 bytes of memory. The array f.a[] occupies the same 4 bytes of memory. So, each byte of the array f.a[] corresponds to one of the [4] bytes of ui, with f.a[0] at the lowest address and f.a[3] at the highest. Which byte of ui ends up at which address depends on the endianness of the CPU. On a little-endian device, the least significant byte is stored first, so the output of this code would be:
a[0] = 0x0d
a[1] = 0x0c
a[2] = 0x0b
a[3] = 0x0a
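The same union can be turned into a simple run-time check of byte order. What follows is only a sketch: the names word_bytes, is_little_endian and byte_by_shift are hypothetical, and I have swapped unsigned int for uint32_t so that the 4-byte assumption is explicit. It relies on C permitting a union member to be read after a different member was written; C++ has different rules. The shift-based function is included for contrast, as it extracts a byte by value and so gives the same answer on any CPU.

```c
#include <stdint.h>

/* Same layout as the union above, but with a fixed-width type
   (an assumption made here so that "4 bytes" is guaranteed). */
union word_bytes {
    uint32_t ui;
    unsigned char a[4];
};

/* Returns 1 if the least significant byte sits at the lowest address. */
int is_little_endian(void)
{
    union word_bytes f;
    f.ui = 0x0a0b0c0du;
    return f.a[0] == 0x0d;
}

/* Returns byte number `index` of n, where 0 is the least significant.
   Uses shifts only, so the result does not depend on byte order. */
unsigned char byte_by_shift(uint32_t n, unsigned index)
{
    return (unsigned char) (n >> (8u * index));
}
```

On a big-endian device, is_little_endian() would return 0 and the four printf lines would show the bytes in the opposite order, starting with 0x0a. If the code needs a particular byte regardless of the CPU, something like byte_by_shift() is the safer choice.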
I do hope that this clarifies matters. Do keep the questions coming!