2

I am writing a program in C (32 bit) where I output a string (15 to 40 characters long). I have elected to use pointers and calloc instead of a formal array declaration. My program functions totally fine so this isn't a question about logic or function, I am simply curious about what's "going on under the hood" of my C code.

My understanding: When I use calloc I am allocating a section of memory in units of bytes. Variables are stored in memory locations of size 32 bits (or 4 bytes). In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next memory location.

My question: If memory locations are 32-bits and I am writing only 8-bits to that memory location, are the remaining 24-bits unused? If not, then are the pointers I'm using pointing to some kind of 8-bit sub-memory location, pointing to 8-bit sections of memory locations?

MCHatora
  • 1
    By "registers" you mean "memory locations"? – Eugene Sh. Feb 22 '17 at 16:56
  • 1
    You write _Variables are stored in registers of size 32 bits (or 4 bytes)._ Not true, it's rather _Variables **may be** stored in registers of size 32 bits (or 4 bytes)._ – Jabberwocky Feb 22 '17 at 16:58
  • 2
    "If registers are 32-bits and I am writing only 8-bits to that register, are the remaining 24-bits unused?" is an assembly issue and is neither specified nor controlled by C. Many possibilities exist. From a C point of view, it is irrelevant. – chux - Reinstate Monica Feb 22 '17 at 17:04
  • Bear in mind too that in C `'!'` is of type `int` not of type `char` so the full register width will be used leaving no unused bits. – Weather Vane Feb 22 '17 at 17:07
  • 1
    You don't wanna know. Cos you shouldn't care. – Jens Feb 22 '17 at 17:13
  • Misunderstanding. One byte has 8 bits. If you want to allocate 16 bytes, you call calloc(16, sizeof(char)), and what you get back is a pointer (an address). This pointer is first held in a register (which is 32 or 64 bits wide); you can then store it in memory or anywhere else so you can access it later. When you execute *ptr++ = 'a', you simply write one byte (8 bits) to the address stored in the 'ptr' variable and increment that address by one byte (of course, ptr must be a pointer to an 8-bit integer type, char, which is a byte). – ttdado Feb 22 '17 at 17:13
  • To be clear, "Variables are stored in registers of size 32 bits (or 4 bytes)" is incorrect. Variables are stored in _some_ kind of memory. A variable may take up only 1 byte, 15 bytes, 42 bytes, 8675309 bytes, etc. This is independent of the size of the processor's registers. A write to n bytes affects those n bytes. If the write affects other memory, that memory is not used at that time and is irrelevant. – chux - Reinstate Monica Feb 22 '17 at 17:30
  • Your question is mixing up too many things. You should first look up what *hardware registers* really are. Then you should look up the `register` keyword for C, which is something completely different. – Jens Gustedt Feb 22 '17 at 17:36
  • Yes, I ignorantly used incorrect terminology here: there should be no references to registers. What I really mean is memory locations. I apologize for the confusion. – MCHatora Feb 23 '17 at 17:33

6 Answers

4

Register usage -- and, technically, even the existence of registers at all -- is a characteristic of the C implementation and the hardware on which it runs. There is therefore no definitive answer to your question at its level of generality. This is for the most part true of any question about "what's going on under the hood".

Speaking in terms of typical implementations for commodity hardware, though,

> My understanding: When I use calloc I am allocating a section of memory in units of bytes.

A reasonable characterization.

> Variables are stored in registers of size 32 bits (or 4 bytes).

No. Values are stored in registers. Implementations generally provide storage for the values of variables in regular memory, though those values may be copied into registers for computation.

Under some implementation-specific circumstances, certain variables might not have an associated memory location, their values instead being maintained only in registers. Generally speaking, however, this is never the case for variables or allocated space that is, was, or ever could be referenced by a pointer.

> In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the points (ptr++;) to move to the next register.

No, absolutely not. Incrementing the pointer causes it to point to the next element of your dynamic storage, measured in units of the size of the pointed-to type. This has nothing to do with registers. Writing to the pointed-to object probably involves register use (because that's how CPUs work), but ultimately the character written ends up in regular memory.

> My question: If registers are 32-bits and I am writing only 8-bits to that register, are the remaining 24-bits unused?

As I already explained, this question is based on a misconception. The target of your write is not a register. In any case, there are no gaps in memory between the elements you are writing.

It is conceivable that under some circumstances, a clever compiler might optimize your code to minimize writes to memory by collecting bytes in a register and performing writes in chunks of that size. Whether it can or will do so depends on the implementation and the options in effect.

> If not, then are the pointers I'm using pointing to some kind of 8-bit sub-register allocation, pointing to 8-bit sections of registers?

Your pointers are (logically) pointing to main memory, which is (logically) addressable in byte-sized units. They are not pointing to registers.

John Bollinger
  • Well, my answer became obsolete before I sent it ^^. +1 for this very in-depth answer – Kami Kaze Feb 22 '17 at 17:38
  • Thanks for the detailed response. I have completely used incorrect terminology here because I am ignorant. All references to the term "register" should read "memory location". So memory locations address bytes not words? – MCHatora Feb 23 '17 at 17:26
  • 1
    @MCHatora, as you see, correct terminology is essential for accurate communication. Even substituting "memory location" for "register", however, you still have a misunderstanding, addressed at the very end of my answer: in C, memory locations are (logically) *one* byte in size, not four. This has nothing to do with the underlying architecture's native word size. – John Bollinger Feb 23 '17 at 17:34
  • Yeah, this was the *real* question I had: how many bits does a memory address "contain" (which is 8). Thanks for this response! – MCHatora Feb 23 '17 at 18:14
1

Those pointers are not certain to be stored in registers; normally they will be just stored on the stack. Whether they end up in registers is an outcome of compiler optimizations. In some compilers you can use the `register` keyword to ensure usage of a register.

Also, there is no "next" register: registers do not have addresses. The register file is a special hardware unit integrated into the CPU, and its registers are usually selected by a certain set of bits in the instruction.

I advise you to use your compiler or disassembly tool to see exactly how it looks in assembly.

  • 1
    The `register` keyword is normally ignored nowadays. See [this SO question](http://stackoverflow.com/questions/578202/register-keyword-in-c) – Jabberwocky Feb 22 '17 at 16:58
  • 2
    `to ensure` -- nope, not at all; compilers are free to, and likely to, ignore the `register` keyword entirely. – Sourav Ghosh Feb 22 '17 at 16:59
  • 1
    `normally they will be just stored on the stack.`, sadly, there's no concept of `stack` in the C standard; that's up to the implementation. – Sourav Ghosh Feb 22 '17 at 17:04
1

Nope, there is no register involved here; in general, registers are a scarce resource.

What actually happens is that you are writing the values into the memory locations pointed to by the returned pointer. Pointers and pointer arithmetic respect the data type, so the returned pointer, converted to the proper type, takes care of the access.

> I write characters using my pointer (i.e. *ptr = '!';) and then I increment the points (ptr++;) to move to the next register.

Not exactly; you are talking about the memory location pointed to by the pointer ptr. If ptr is defined as char *, ptr++ is the same as ptr = ptr + 1, which increases ptr by the size of the pointed-to data type, char. So, after the expression, ptr points to the next element in the allocated memory.

Sourav Ghosh
0

You can request in C that a variable go into a register, though most compilers treat this as a hint at most; where a variable goes depends on what kind of variable it is. Local variables typically go on the stack, while memory allocation functions typically take space from the heap and give you its address. Constants and string literals typically go into a read-only data segment.

NathanAck
0

As Sourav pointed out, you are using the term "register" incorrectly. There is a kind of storage called a register, and there is a keyword `register` in C, but neither has much to do with pointers.

The typical size for an aligned memory block is 16, 32, or 64 bits, depending on your architecture. You are thinking that incrementing your pointer advances it by that block size. This is not correct. The step size of an increment depends on the type of pointer you have: it is always the size of the pointed-to data type in bytes.

A `char *` is increased by 1 byte when you apply `++`, while a `long long *` is increased by `sizeof(long long)` bytes, typically 8.

As arrays can decay to pointers on some occasions, the mechanics are quite similar.

What you may be thinking of is alignment padding: if you declare a char followed by an int in a struct, the compiler typically places the int at an address that is a multiple of its alignment, and the bytes in between are "wasted". But memory you allocate yourself is yours to control, so you can pack chars into it back to back, just as in an array.

Kami Kaze
0

There seems to be confusion about what a register is. A register is a storage location within the processor. Registers have different functions. However, programmers are generally concerned with GENERAL REGISTERS and the Process Status Register.

General Registers are scratch locations for performing computations. On some systems all operations are performed in registers. Thus, if you want to add two values, you have to load both into registers, then add them. Most non-RISC systems these days allow operations to take place directly to memory.

> My understanding: When I use calloc I am allocating a section of memory in units of bytes. Variables are stored in registers of size 32 bits (or 4 bytes). In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the points (ptr++;) to move to the next register.

Your compiler may assign variables to exist in registers, rather than memory. However, any time you dereference a pointer (e.g. *ptr) you have to access memory.

If you call

char *ptr = calloc (...) 

The variable ptr may (or may not) be placed in a register. It's all up to your compiler. The value returned by calloc is the location of memory, not registers.

What you should do to learn this is to generate assembly language code from your compiler. Most compilers have such an option and they typically interleave your C code with the generated assembly code.

If you do:

> In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the points (ptr++;) to move to the next register.

Your generated code might look like (assuming ptr is mapped to R0):

 MOVB '!', (R0)+

On several systems, this moves the value '!' to the address pointed to by R0, then increments R0 by one.

> My question: If registers are 32-bits and I am writing only 8-bits to that register, are the remaining 24-bits unused? If not, then are the pointers I'm using pointing to some kind of 8-bit sub-register allocation, pointing to 8-bit sections of registers?

In your case, you are not reading and writing bytes to registers. However, many systems do allow registers to be subdivided (on x86, for example, AL names the low 8 bits of EAX).

user3344003