Posts

Project Part 3

Alright, let's start off with the most relevant function: mainSimpleSort (50%, 28%). Right off the bat, there's a tiny bit of hoisting that was half-done but seems to have been forgotten. As you can see, lo + h - 1 is loop invariant. lo + h is actually calculated outside the loop at the very beginning of the function, but it is used as a counter. I will create a separate variable called loHi_1 and hoist this calculation. This hoist can be repeated three times, since there are three separate instances of this condition in three separate loops. Another hoist we can do is to move the v+d calculation outside the loop. I've repeated the same procedure, storing this result in a variable called v_d. At this point in my blog post, I would like to thank the person who created the bzip2 makefile. At the end of the makefile, there are six small sample files that are run through the bzip2 function suite (decompress, compress, recompress) and their res...
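To make the two hoists concrete, here is a minimal sketch of one gap-h insertion pass with both invariants pulled out of the inner loop. It is not the bzip2 source: `gap_pass_hoisted` and `greater_at` are hypothetical names, and `greater_at` is a single-byte stand-in for the real comparison function, which does considerably more work. Only `loHi_1` and `v_d` are taken from the post.

```c
#include <assert.h>

/* Hypothetical stand-in for the real comparison: compares one byte only. */
static int greater_at(const unsigned char *block, unsigned a, unsigned b) {
    return block[a] > block[b];
}

/* One gap-h insertion pass over ptr[lo..hi], ordering by block[ptr[k] + d]. */
void gap_pass_hoisted(unsigned *ptr, const unsigned char *block,
                      int lo, int hi, int h, int d) {
    assert(h > 0 && lo <= hi);

    /* Hoisted: lo and h never change inside the loops. */
    const int loHi_1 = lo + h - 1;

    for (int i = lo + h; i <= hi; i++) {
        unsigned v = ptr[i];
        /* Hoisted: v and d are fixed for the whole inner loop. */
        const unsigned v_d = v + (unsigned)d;
        int j = i;

        while (greater_at(block, ptr[j - h] + (unsigned)d, v_d)) {
            ptr[j] = ptr[j - h];
            j -= h;
            if (j <= loHi_1) break;   /* was: j <= (lo + h - 1) */
        }
        ptr[j] = v;
    }
}
```

An optimizing compiler may already perform this kind of hoisting at higher optimization levels, so any gain from doing it by hand is best confirmed by measurement.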

Project Part 2

For the second phase of the final project, we have been instructed to profile our project and identify our optimization targets. This went terribly, and then very well. Profiling an application requires a few steps: (1) build a version of the app with the -Og flag, (2) run the test case, (3) parse the gmon.out file, and (4) make sure the results make sense. Building the application was as simple as modifying the makefile, which went off without a hitch. I assumed that running my full run-tests.sh script on a profile-compatible version of the application would be the best approach. This was wrong for two reasons, which quickly became three reasons. If you remember (I don't blame you if not), the run-tests.sh test case takes approximately 13 minutes and 20 seconds to execute completely. bzip2 does not spend an equal amount of time on each file; these 13 minutes are mostly consumed by the three focus-tests that I defined in my previous post. Compiling with the -Og flag reduced optimization, which c...

Project Part 1

For the first phase of the final project, we have been instructed to find and build an open source software package, and to benchmark the software on both x86_64 and AArch64 machines. We need to find a test case that takes a minimum of four minutes to complete, and prove its consistency. I'll start this post with a bit of my project discovery process. Since we worked with a volume adjustment program in class during our benchmarking lessons, I decided to work with a software package focused on audio. Coincidentally, I've been playing around with Audacity in my free time, and given that Audacity is open source, it was a strong candidate for this project. Audacity leverages the portaudio library, which was my initial choice for this project. Portaudio uses the ALSA library for its functionality on Linux, which became my second choice for this project. Finally, after trying and failing to build both previous libraries, I settled on a dependency that they both...

Lab 5 x86_64 Update

This post will serve as a follow-up to the previous one. The second part of lab 5 was fairly straightforward, so this post will be fairly short, since most of the content was covered in the previous post. The logic has barely changed between the AArch64 version and the x86_64 version; the main notable difference is the use of the rax and rdx registers as the division quotient and remainder. This also removes the msub logic required on the AArch64 version, meaning that the code is very slightly smaller. Finally, instead of using the decimal value 48 to convert to ASCII, I opted to use the hexadecimal 0x30. This produces the same result, but I decided on hex just to prove that I can do it. You can take a look at the x86_64 version here.
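The quotient/remainder split and the 0x30 offset can be sketched in C rather than assembly. This is only an illustration under assumptions: `two_digits` is a hypothetical helper, and the two-digit range is assumed rather than taken from the lab code. The comments note where each value ends up after an x86_64 `div`.

```c
#include <assert.h>

/* Writes the two ASCII digits of value (0..99) into out[0] and out[1]. */
void two_digits(unsigned value, char out[2]) {
    assert(value < 100);

    unsigned quotient  = value / 10;  /* x86_64 div leaves the quotient in rax */
    unsigned remainder = value % 10;  /* ...and the remainder in rdx */

    /* 0x30 is 48 decimal, the ASCII code for '0'. */
    out[0] = (char)(quotient + 0x30);
    out[1] = (char)(remainder + 0x30);
}
```

For example, `two_digits(27, buf)` stores `'2'` and `'7'`. On AArch64, `udiv` only yields the quotient, so the remainder has to be recovered with a separate `msub`, which is the extra step the x86_64 version avoids.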

Lab 5 SSH & Update

Lab 5 is the first lab where we are required to program on an x86_64 or AArch64 processor. Naturally, this will present a moderate challenge as we will be working with a more complicated instruction set. Before we begin writing any code, we will have to access an x86_64 or AArch64 machine. Since the machines provided by the college are mostly AArch64 (with only one of the machines being x86_64), AArch64 seemed like a good place to start. Before I write about the actual lab content, I would like to share some entertaining details of the prior class where we actually accessed the machine. At the beginning of the semester, one of the first tasks that the class was given was to generate a pair of SSH keys and send the public key to our professor. This would be used in the future to enable our access to the college machines, where we would write code for this lab. However, when we were instructed to access the machines in class, pretty much everyone with a machine had either misplaced or fo...

AArch64/x86_64 quiz & Lab 4 Info/Update

This post is going to serve as an info dump for the previous few weeks in class. I've had a pair of WIP posts for the AArch64/x86_64 quiz and Lab 4 sitting in my drafts, so I've decided to merge them. The AArch64/x86_64 quiz went very well. Prior to the quiz, I met with some of my peers for a study session regarding the previous week's material. While the x86_64 architecture is significantly more confusing than the AArch64 architecture, both architectures have similar functionality, such as accessing registers in 8, 16, 32, 64 bits, safe registers to avoid trampling, and specifying which register to store results in. Outside of the architecture material, we also spent some time reviewing bit-flipping with XOR (EOR). Even though it was not on the quiz, I greatly enjoyed this topic. Working with logic gates was my gateway to programming, and using the exclusive-or gate to flip bits was an interesting miniature logic puzzle. I'm hoping to have XOR/EOR appear on a quiz later in t...
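The bit-flipping trick can be shown with a short C sketch. `flip_bits` is a hypothetical helper written for illustration; the `^` operator corresponds to `xor` on x86_64 and `eor` on AArch64.

```c
#include <assert.h>
#include <stdint.h>

/* Flips every bit of value that is set in mask; other bits pass through. */
uint8_t flip_bits(uint8_t value, uint8_t mask) {
    return (uint8_t)(value ^ mask);
}
```

XORing with a 1 inverts a bit and XORing with a 0 leaves it alone, so `flip_bits(0x0F, 0xFF)` gives `0xF0`, and applying the same mask twice restores the original value.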

Lab 3 Progress

Over the last two weeks, I have been collaborating with my peers on lab 3. I have chosen the two-digit decimal display as my lab; I believe this task to be just enough of a challenge to push my understanding of assembly language. So far, I have familiarized myself with the DCB directive, as well as reading from a set of DCB values and printing to the screen. Currently, my lab 3 prototype declares a zero and a one, and can display either. I have also begun to work with keyboard input: reading a keypress, comparing it against a hex value, and executing an instruction based on that keypress. I believe that these two things are the backbone of this task for lab 3. The final unknown for lab 3 is a bug that occurs in my current code. When I draw my initial value (00) on the bitmap, the second 0 (the rightmost digit) prints a 1 directly below it. I believe this happens because my iterator for the number of rows printed for the second digit does not know when to terminate, and continues printing ev...