Making Smallest Possible Linux Distro (x64)

Nir Lichtman

Full Transcript

Let's go ahead and make the smallest possible Linux distro. For this we're going to need a couple of components. The main one is, of course, the kernel itself, but we also need a bootloader and some kind of user space. We're going to start here with the kernel: we want to build a very small kernel. As you can see over here, I have cloned the Linux repository, and I'm going to put information about all the setup in the description.

The first command I'm going to run here is make help, and it shows me a bunch of options I can pass to make. Specifically, you can see it also includes configuration options; for example, this is the default configuration for x86_64. I'm looking specifically for an option called tinyconfig. You can see that this configures the tiniest possible kernel, so this is going to be the base of what I'm going to use. To use it, I'm just going to run make tinyconfig. Now I can see the configuration is written.

I'm just going to make a couple of changes, so I'm going to run make menuconfig. First of all, I want to turn on the 64-bit kernel, because this video is going to focus on 64-bit Intel. Afterwards we need to enable a couple of things. I'm going to go to General setup, and first of all I want to enable initramfs support. This can be important for supporting the user space that we're going to build, so I'm going to check this, and we can uncheck all these options: there's no need for support for the special compression types. Now I'm going to scroll down a little bit more (make sure this is enabled on your setup) and press Enter on Configure standard kernel features. Over here I'm going to scroll to the option that says Enable support for printk, and I'm going to enable it so we can actually see messages coming from the kernel while it's booting. Now let's exit from here. Next, I'm going to scroll down here to
Device Drivers, and here I'm going to scroll to Character devices and enable TTY. TTY stands for teletype, and this is important so that we get actual text on the screen. Now I can finally exit, and let's go inside Executable file formats. Here, make sure to enable Kernel support for ELF binaries. This is important because we want to run programs on the kernel. Now we can exit and save the changes.

Now let's go ahead and run make -j4. This splits the make into four jobs, which makes it faster because I have multiple cores on my computer. Okay, nice. Now we can test this image to see that it works. I'm going to use QEMU for this, to emulate a computer that is booting up, and I'm going to use the -kernel flag because we're passing in a kernel image. We can see the kernel is booting up nicely, and it finishes by searching for the init process. It's trying all kinds of paths over here, and finally it gives up and panics because it didn't find any working init: no working init found. Let's go ahead and fix this, but first of all I want to show you how small the kernel image is. Check this out: it's only 781K.

Now let's build the user space. I'm going to build a very, very minimal shell; it's just going to be a program runner. I'm going to start with a C file that I'm going to call shell.c. I want to start by writing something on the screen: the prompt of the shell. So I'm going to use the system call called write. The first argument is the file descriptor, and that's going to be 1 for stdout, so we write to the console. After that comes the buffer that I want to write, and that's just going to be the prompt, so something like this. Finally comes the count of bytes, and that's just two bytes.

Afterwards I'm going to read some input from the user, which is going to be the program to run, and for that I'm going to use the read system call. By the way, a reminder that the code in this video is just for fun and not for production purposes. I'm going to skip all kinds of important stuff like error checking, just to make everything here as concise as possible. read has a very similar interface, also file descriptor, buffer, and size; it's just the opposite direction. This time we're going to use file descriptor number 0 for stdin, since we're getting input from the user. After that comes the buffer, let's just call it command, and the size, let's say 255 bytes, so we're putting a fixed limit on the size of the command. I'm going to define command over here.

Now let's take a look at the return value of read. We're going to assume success here, and on success the number of bytes read is returned. So I'm going to save this return value, let's call it count, and now I'm going to replace the newline at the end with a null terminator. Let me explain. Let's say you want to run ls, so you type /bin/ls. The read system call is going to return this followed by a newline. What we want is to turn it into /bin/ls followed by a null terminator, to terminate the string. The trick to do this is to go to command at index count minus one, which is where the newline is going to be, and I'm just going to place a null
terminator over there. Nice, so now the command is ready and we can go ahead and execute it. For this I'm going to use the execve system call. The first argument is the pathname, and that's just going to be command. After that we have a nullable argv, so I'm just going to pass 0 here. It's not the best practice, but again, this is just a very simple example. The same goes for envp. Now we can test this out to see how it works: a very, very simple program runner that runs one program and then exits. Let's build this with GCC, and I'm going to try, for example, /bin/ls. Nice, we can see it works.

But notice that it just exits after running this, and we want it to continue receiving commands. I'm going to put all of this in a for loop, but this is still not enough, because running execve replaces the running program with the new program that is specified over here, so this replaces the shell entirely. We need to fix this, and we're going to use fork for that. This forks a new process, and then we can run execve in the child. We can see that fork creates a child process. We're going to do that before the execve. It returns a pid_t; I'm going to talk about the return value soon, but let's save it right now.

Now let's talk about the return value. On success, the PID of the child process is returned in the parent, and 0 is returned in the child. This is important because this is how we know we're the child. So if the fork result is 0, then we call execve. Else (again, in this video we're not going to handle failures) we just assume that we're in the parent. What do we want to do in the parent? The parent wants to wait for this to finish, this being the child over here, so for this we're going to use some kind of wait mechanism.

Now, one more thing I want to do here is break from the loop in this line. Remember, this is the code that is running in the
child. The reason is that if the command is good and execve succeeds, it successfully replaces this process with something else, as specified in the command. But if it fails, it just continues executing the code afterwards, and I want it to break from the loop if it gets here, so the child won't also become a shell.

Now let's talk about this wait over here. We can see that we have quite a few options we can use for waiting, but I want to say something about this: even though this page is in section 2 of the manual, the system calls section, these are not all actual system calls in the kernel. The only one here that is actually in the kernel is waitid, and I'm going to use the one that is actually in the kernel. You'll see soon why I want to stick with that and not use the others, which just translate to something else that eventually goes to the kernel. So let's stick with waitid. I can see we have something else to include here, so I'm going to copy this.

Now let's take a look at the arguments of waitid. This is what the signature looks like. I'm going to copy this, and let's read about what we need to pass to these parameters to make it work. First, my goal with this function: remember that this flow is happening in the parent, and the parent wants to wait for the child to finish executing the command. So what I want waitid to do is pause the parent until the child has finished, so the parent knows when to prompt for the next command.

We can see that the documentation starts by talking about the idtype and id arguments, and that there's an option to wait for a specific child. We're going to make this as simple as possible and wait for any child, and that's going to be P_ALL. Remember, this goes into idtype, and then id is ignored, so we can just pass in 0 for id. Afterwards we've got this siginfo_t structure. I'm not
going to do much with it, but we need to pass it, so I'm just going to define it here and pass in the pointer. Finally we've got some options, so let's take a look at them. There are a couple of flags we can specify in options, and we're just going to wait for children that have terminated, so it's this flag over here, WEXITED. Okay, now we're ready to test out the shell again. It works nicely, and we get the prompt back. Nice.

But let's take a look at a problem before we can use this for our distro. First of all, you can see that the executable is 16 KB, which is not bad; it's pretty small. But here's the problem: this is a dynamic executable. If I run ldd on the shell, we can see that it's not a static executable; it requires, for example, the libc.so shared object. To fix this, I'm going to use the same GCC command but pass in -static. But check out what happened: now it's suddenly 750K. This is quite a large size for such a small program, but at least it's not a dynamic executable anymore, so this is fine and will still work nicely.

But let's make this even smaller. For this we're going to need to write a little bit of assembly. What we're going to do is make the system calls ourselves, not through the C library. We're going to implement these functions: write, read, execve, and finally waitid. Let's start with write. I'm going to open a new assembly file, and let's call it sys.s, because this will be our connection with the system. I personally prefer the Intel syntax. I know that in the Unix world the AT&T syntax is really prevalent, but I find the Intel syntax much clearer. We're going to use the GNU assembler, so I'm going to start with the Intel syntax directive. Let's also open the info page for as, which is the GNU assembler, and I'm going to open the entry about Intel syntax. By the way, I opened this search by pressing i. So you can
see it talks about the Intel syntax here: it selects Intel mode. We also have an optional argument, which is either prefix or noprefix, and I'm going to use noprefix, because this specifies that we don't require the percent prefix on registers. Afterwards I'm going to make my symbols global so the linker will know about them, and we're going to start with just write for now.

Now let's implement write. What is write going to do? We want to select the system call (we don't know the number yet, but we'll find out soon), make the system call by running the syscall instruction, and finally return. Now let's see how we can find the number of the system call. For this we go to the source code of the kernel, then to the arch directory, x86, include, generated, asm, and then to the syscalls_64.h file. Here we've got the numbers from the kernel itself, and as we can see, write is number 1. That closes the deal for write, but let's also implement read. As you can see over here, read is system call number 0. Now let's do this for fork, which is 57, and finally for waitid, which is 247.

But hold on for a second, we're going to have a little problem here. Notice that write uses three arguments, read uses three arguments, and fork doesn't take any, but waitid has four arguments. We've got a couple of problems with the waitid function that we need to fix. The first one is that the actual function in the kernel does not have four parameters; it has a fifth parameter, and we can see this in the documentation. If I open the man page for waitid, notice this little comment over here: this is the glibc interface, see the notes for information on the raw system call. So let's see the notes about this. I'm going to search for waitid across the document: the raw waitid system call takes a fifth argument of this type, and we can see that this argument can be null. So that's
what we're going to do: I want to pass in the fifth argument, and it's just going to be 0. Because it now has a different signature from the one in the standard library, I can't use the header files for this anymore, so I'm going to use this as a prototype. I'm going to copy this and put it over here. Now we're just declaring that this exists; we're going to define it in the assembly, of course. I'm going to call it real_waitid and add the fifth argument that they're talking about over here. As you can see, this fifth argument is a pointer. Since I don't care about this argument, we're just going to put void and then a star. Now let's go back down here, and instead of calling waitid we're going to call real_waitid. We'll go to the assembly and change the name there as well. Let's make all of these global.

Now, one more thing about waitid: notice that we now have five arguments. As you can see, I'm on Wikipedia, on the x86 calling conventions page, and over here is the System V AMD64 ABI. ABI, by the way, stands for application binary interface. These are the registers used for parameters in the System V ABI. They are not exactly the ones the kernel uses; the kernel uses slightly different registers. You can see that they're almost the same (RDI, RSI, RDX), but the kernel doesn't use RCX for the fourth argument; it uses R10 instead. So we need to fix that before making the system call: we take whatever is in RCX and move it into R10, and this fixes things up. Remember, C is calling this function as if it's a regular C function in your program, but we're making a system call here, so the interface is slightly different. If we want to know the reason behind this, we can take a look at the Intel spec. This is the syscall instruction that we just
used, and you can see that the syscall instruction uses RCX for saving the return address, that is, where to come back to after the system call. That's the reason the kernel can't use RCX: it's already used by the syscall instruction in the CPU.

Nice, so we got rid of the dependency on libc. Let's try this out. It's going to be a little more complicated to build now, so we'll go through it step by step. First of all, I delete the shell binary we just built. Notice that we have two files over here: the C file and the assembly. Let's start by compiling just the C file: gcc -c, compile only, shell.c. Now we have shell.o, the object file. Next we assemble the assembly. For this I'm going to use as and just pass in the assembly file. Notice that it calls the output a.out. So this is the built assembly and this is the built C code, but it's still not ready to run; we need to link them together. For this I'm going to use the ld linker. We're going to call the output shell, and I'm going to pass in both object files: first shell.o and then a.out. I'm going to pass in --entry main, because our entry point is called main; remember, that's the entry point for our C code. Also, something important: I need to specify this parameter over here, -z and then noexecstack, which tells the linker that I don't want an executable stack.

Looks like we forgot to define the execve system call in our assembly, so let's fix that real quick. Right after read I'm going to define execve as well. Let's search for execve over here, and that's number 59. Nice, so we're done with this. Let's check it out again. This time I only need to assemble again and then run the ld command as before. And I forgot to make it global; global makes it visible to the linker.

Now there is one more thing I want to fix in the code. Notice that after execve over here, in case it fails, we break. But where does it break to? Out of this for loop, so the execution of the code continues right over here. But since our entry point is now main and we don't have a C library anymore, this is the actual entry point of the executable, not some entry point provided by the C library, so we need to take care of exiting from the program ourselves. At the end of the main function we're going to call the exit system call and pass in 0, for example. We also need to implement this in the assembly, as we did with the other system calls. So let's open the assembly file, sys.s, and add exit as well. I'm going to open the file from the kernel again, and here it is: sys_exit is number 60. Technically there's no need for this return, so we can remove it, because exit doesn't return.

Now let's build it again as we did before. I'm going to start by running gcc -c on shell.c, afterwards we assemble, and then I run the ld command from before. My bad, I also forgot the global. We need to assemble again first, and then run ld. Now it works. Let's check it out real quick, and nice, the shell works. Let's check the size for a sec: as you can see, it's only 9.1K, very small.

Now let's pack this into an initramfs that we can use with our distro. But first I want to change the name of the binary to init, because this is the default name of the init process. Now I'm going to create an archive with this file. I start by echoing the name of the file I want to include in the archive, echo init, and then I pipe this into cpio -H newc, which is the format the kernel supports, with -o, which creates a new archive, and I redirect the output into init.cpio. Now we have the archive over here.

Now let's go back to the Linux source code. I'm going to run make help again, and this time let's take a look at an interesting option called isoimage. This creates a boot CD-ROM image, and it puts it at this path over here. We can even specify arguments for the booted kernel with FDARGS, and the initramfs right over here. So let's use this: make isoimage. Let's start by specifying FDARGS. Here I'm going to specify the path of the initramfs on the disk image itself, which is going to be /init.cpio, as we named it ourselves. I do this by specifying initrd=/init.cpio. Next I'm going to specify FDINITRD, and here we specify the path on our host machine where it can find the cpio archive that we just created, which is right over here. Let's run this. Nice, we can see the ISO image is ready. I'm going to copy it, and we're going to use QEMU again to test it out, this time with the -cdrom option. Nice, it's booting up, and we can see the shell is working. But we can't really do anything, right? It's going to just do nothing, because we don't have anything here except the shell itself.

Let's finish off by putting Lua in our distro. The reason I'm choosing Lua is that it's a very minimal programming language that is easy to build from scratch. We're going to make a static build of Lua and then put it inside the distro. So let's click on download, and I'm going to copy the link to the tar.gz over here. Now let's go back to the fun directory and wget that path. Nice, and now I'm going to extract it. Now I've got the Lua directory over here, and I'm going to go to src, where we can see the source code of Lua. I specifically want to open the Makefile, because I want to make it a static build. I'm going to scroll down a little bit and specify custom LD flags, which are custom flags for the linker, and pass in -static. That's it for the Makefile. I'm going to save this, and now let's just run make. Nice, now we have a static build of Lua. We can make sure by running ldd on lua, and here it is: not a dynamic executable. Let's just make sure that it works. Nice.

Now let's put this inside the distro. I'm going to move the lua executable up to my fun directory. Now I've got it over here, and I want to pass both init and lua into cpio. Now the file list looks like this, and we create a cpio archive from these files with the same flags as before. By the
way, the size of the Lua static build is 1.4 MB, which is pretty big, but remember that Lua relies heavily on the C library. There are some optimizations we could do, like maybe using a different library, but this is out of the scope of this video. So now we've got the cpio archive ready. Let's make another ISO image with it: let's go back to the kernel and run make isoimage again, as we did before, with FDARGS and FDINITRD. Now we have the ISO image ready, so let's test this out with QEMU. Nice, we have our shell over here, but let's see if Lua works. Let's run lua, and nice, we've got Lua in our custom distro. Pretty cool.

Now let's see that we can also run this with VirtualBox. Let's close QEMU, and I'm going to copy this ISO image to my host computer and open up VirtualBox. Let's create a new virtual machine. I've got the ISO image, so I'm going to specify it over here; this is the path. Here we're just going to choose Other Linux. I'm going to give it a very low amount of memory, to show you that it works with only 32 MB, and we're not going to add any virtual hard disk; it's just going to run from memory. Now let's start this. Nice, we can see it works. We've got the shell over here. Let's see that Lua works: Lua is working nicely. Let's do something a little more interesting, so let's make a for loop. Nice, everything here is working.

By the way, before we finish off, let's take a look at the size of the ISO image right now. With Lua, the size is 2.7 MB. But let's see how big it was without Lua. I'm going to go back, run the same command that we did, just with init and without lua, and run make isoimage again. And this is the smallest distro: 1.3 MB. By the way, if you're curious, the isoimage option uses a bootloader called ISOLINUX.
