SPO600- Project Phase 3

In this post I will discuss Phase 3 of the SPO600 final project: submitting a patch.

Unfortunately, a lot of different problems arose for me in this phase. Initially I was designing a script around a text file containing a generated list of every combination of a set of 0s and 1s. In my script each digit represented a group of options passed to gcc; a 0 in a given position meant that group of options was turned off. Switching these groups on and off would ideally affect the compile and execution times of the finished program. A few problems arose from this. First, reading each combination from a file would be quicker in Python, but it makes for some highly specific code. Not that this code would be used for anything else, but changing the size of the groups has direct consequences: regenerating the file and tweaking it with text generators each time is cumbersome, and the approach is not precise enough to produce a set of combinations specific enough to narrow down any sort of data. A combination set with 18 groups (each with 10 values and one with 7), for instance, has more than 200,000 combinations, and that is still a pretty broad kind of testing system. I realized this would be terrifying to debug: a single regular expression inconsistency on one line could cost a lot of work, and manually sorting through hundreds of thousands of values to find something amiss would take time, to say nothing of re-appropriating the loops and expressions for different group sizes. That made me want to take a step back and try something else. So I have started to use Python's itertools.product function, and by adding the repeat argument I have finally been able to generate the same list I was using before (along with being able to choose the group size and let it generate things for you). However, the time left in the course since making that change has not been enough for me to finish the script and grab the proper values.
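As a minimal sketch of the itertools.product approach described above (the function name option_masks and the group count of 3 are only for illustration, not taken from my actual script), the whole pre-generated text file can be replaced by one call:

```python
from itertools import product


def option_masks(group_count):
    """Return an iterator over every on/off combination for group_count groups.

    Each combination is a tuple of 0s and 1s; a 0 in a position means that
    group of gcc options would be left off for that run.
    """
    # repeat=group_count is what removes the need for a generated text file
    return product((0, 1), repeat=group_count)


# Hypothetical small example: 3 groups instead of the real group count
masks = list(option_masks(3))
print(len(masks))            # number of combinations, 2 ** 3
print(masks[0], masks[-1])   # first and last combination
```

Because product is lazy, the combinations for a larger group count can be looped over one at a time instead of being held in memory or in a file.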

Secondly, I had a problem with one of the main concepts of my project. I was initially attempting to read the output of the compiled testing suite that comes in the box with PHP. My flawed understanding was that, after running my script, I could use the tests to narrow down a section of code that would be heavily affected by the optimizations I was applying. However, the gmon.out produced after the test run only contained output from the final test started by the script, and its execution time was far too low to be of any use. This left me searching for a section of PHP code (or even writing a small algorithm) that would run long enough to register in the gmon.out. That is admittedly a less than difficult task, but with the revelation that my thinking had been going in the absolute wrong direction coming so late, I have been unsuccessful.

Thirdly, working with most of this in PHP seems to create cascading problems. As discussed in previous posts about my project, being able to create a gmon.out is kind of a toughie. The only way I have been able to create one is by doing a fresh configure of my system, adding -pg to the lines I specified previously, adding a set_time_limit() call to any script I would like to run, and adding it to a few natural files before running make. Changing these values and then re-issuing make has no effect. This brings up a few things. Not being able to enable optimizations inline means having to write to the makefile after every configure, and a regular expression for that is difficult and highly situational. Another problem is the question of reconfiguring every time: will the benchmark remain similar? Will this affect further operations?

I think if I were to restart this project knowing what I know now, I would first perfect the script on a meaningless benchmark program, with a configurable group size so that switching between sizes is easy. That would give a higher-level understanding of the effects of these compiler options, as well as help me understand what is happening if odd edge cases are present. It would also be neat to be able to add your own prefixes or option groups in case you were compiling with another program. The next step would be to find a clear area to benchmark, and to get a few baselines before struggling uphill to affect it dynamically. Trying to tackle all three of these basic problems at once left me confused even after seeking clarification on them. Addressing a small problem on each front at a time while asking people with more experience only served to give me bigger questions.
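A rough sketch of what collecting a few baselines could look like, assuming the benchmark is any command that can be run from Python. The function name time_command and the stand-in command are hypothetical; the real benchmark would be whatever compiled program is being measured.

```python
import statistics
import subprocess
import sys
import time


def time_command(command, runs=5):
    """Run command several times and return the wall-clock time of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.check_call(command)  # raises if the command fails
        samples.append(time.perf_counter() - start)
    return samples


# Stand-in workload: a tiny Python one-liner instead of a real benchmark
samples = time_command([sys.executable, "-c", "sum(range(100000))"], runs=3)
print("median seconds:", statistics.median(samples))
```

Taking the median of several runs makes a single slow run (for example from a busy machine) less likely to distort the baseline.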

SPO600 Lab 2 – Baseline Builds

The object of this lab is to create a baseline build from a list of available programs, in a situation that will be consistent and repeatable. For this lab I have chosen to benchmark Perl.

I decided to download a tarball of the Perl source using the command

wget http://www.cpan.org/src/5.0/perl-5.20.2.tar.gz

I will build an identical version on both red and australia and record the output using different make -j options for comparison.

To install Perl, I ran the following commands:

sh Configure -de

make

make test

DESTDIR=$HOME/bin make install

After adding -pg to the following lines, I was able to produce a gmon.out on both architectures.

CC = cc

LD = cc

Here is some more information about the system that may be useful for replication:

./perl --version

This is perl 5, version 20, subversion 2 (v5.20.2) built for aarch64-linux

Copyright 1987-2015, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the

GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on

this system using "man perl" or "perldoc perl". If you have access to the

Internet, point your browser at http://www.perl.org/, the Perl Home Page.

[lcmartin2@red perl-5.20.2]$ free

              total        used        free      shared  buff/cache   available
Mem:       16713728      346816     9890496        7936     6476416    16181120
Swap:             0           0           0

SPO Phase 2 – Entry #2

For this entry I will be elaborating on Python.

I managed to download Python and install it into a local directory on both architectures, red and australia.

For comparison's sake I will show the commands used to create the same testing conditions.

wget https://www.python.org/ftp/python/3.4.3/Python-3.4.3.tgz

tar zxvf Python-3.4.3.tgz

cd Python-3.4.3

Australia

./configure

make

make test

mkdir $HOME/bin
DESTDIR=$HOME/bin make install

I used a local directory on advice from my instructor to avoid using sudo. This also seems like a good way to pre-determine testing conditions. To verify accuracy, input these commands and compare.

To access this version, if you installed it in a similar way, issue this command:

/usr/bin/python3.4

This will work on both architectures if installed in the same way. It will also be the first line of my Python scripts.

To test, I created a small hello world.

#! /usr/bin/python3.4

print("Hello World")

To execute this script, make it executable and then run it:

chmod u+x filename.py

./filename.py

I am still waiting on some feedback from my instructor to finalize a few things. The next post will hopefully detail the Python script I've created for my project, as well as the output.
-Liam Martin

SPO Phase 2 – Entry #1

This entry is to document the second phase of the final project, Optimization.

One of my main objectives in this phase is to test all possible combinations of optimizations for the gcc compiler.

Considering that there are 187 different optimization options in the distribution of gcc that I am working with, I have begun to work on a script.

My instructor mentioned working with Python, so I have begun to do some research on that. Since I am not as familiar with scripts as I am with other kinds of programs, I have come across a few questions I need to answer before proceeding (and this is where my research is heading):

-How does Python work with specific command-line arguments? Can I just call certain arguments sequentially with some programming logic intertwined?

-How is this implemented on the system?

-Will this script itself be affected by the architecture?

This is maybe some pretty basic stuff that will come up early during research, but it does need to be addressed.

During research I have thought of a few different approaches to the logic of the problem. Separate from how I would syntactically call these arguments, I am figuring out how to test all combinations.

I believe I will create an array (possibly two-dimensional, depending on the exact approach) that is 187 fields long to represent all of the options for the compiler. Each field will contain a boolean value (representing enabled or disabled) that, if true, will determine that option's inclusion in that particular command.
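A small sketch of the boolean-array idea, under a few assumptions of mine: the function name build_command, the file name test.c and the three flags are placeholders chosen for illustration, and the real list would have 187 entries. The base level is set to -O1 because gcc ignores most individual optimization flags when no -O level is given.

```python
def build_command(flags, enabled, source="test.c", base_level="-O1"):
    """Build a gcc argument list from parallel lists of flags and booleans.

    flags[i] is included in the command only when enabled[i] is true.
    """
    if len(flags) != len(enabled):
        raise ValueError("flags and enabled must be the same length")
    chosen = [flag for flag, on in zip(flags, enabled) if on]
    return ["gcc", base_level] + chosen + [source]


# Illustrative subset of gcc optimization flags, not the full list of 187
flags = ["-ftree-vectorize", "-finline-functions", "-funroll-loops"]
enabled = [True, False, True]

command = build_command(flags, enabled)
print(" ".join(command))  # gcc -O1 -ftree-vectorize -funroll-loops test.c
```

The returned list can be handed straight to the subprocess module, which avoids having to quote the arguments for a shell.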

More to come very soon on both of these.

-Liam Martin

SPO600 Project Phase 1 – PHP Project Selection & Analysis

This post documents Phase 1 of SPO600's final project. There are three phases in total for this project:

1. Identifying a Possible Optimization in the LAMP stack

2. Optimizing

3. Committing the Changes Upstream

Areas to Optimize

For Phase 1 I will be discussing my chosen topic, PHP.

This link is where I downloaded PHP from, if you want to download it through the website:

http://php.net/get/php-5.6.7.tar.bz2/from/a/mirror

Commands used to build the version of PHP I will be testing on:

wget -P PHP/ http://ca1.php.net/get/php-5.6.7.tar.bz2/from/this/mirror

tar xvjf php-5.6.7.tar.bz2

scp /home/lcmartin2/PHP/php-5.6.7.tar.bz2 lcmartin2@red.proximity.on.ca:/home/lcmartin2

./configure --prefix=/wwwroot --enable-so

make

make test

After being notified about a lack of architecture-specific code for AArch64 in a lot of files that do contain that type of code for x86_64, I have decided to test with the gcc -Q option to determine whether any combination of compiler optimization flags can improve performance, or to detect a certain area in the code that would greatly benefit from some platform-specific code.

This brought me to comparing the output of the gcc -Q -O# --help=optimizers command on both architectures to see whether they differed.

(The commands used here go through the different optimization levels, counting the number of enabled and disabled optimizations at each level.)

gcc -Q -O3 --help=optimizers | grep -c -F "[disabled]" > test2
gcc -Q -O3 --help=optimizers | grep -c -F "[enabled]" > test

// Results of Above Statements
[lcmartin2@red php-5.6.7]$ cat test
120
[lcmartin2@red php-5.6.7]$ cat test2
67
[lcmartin2@australia php-5.6.7]$ cat test
118
[lcmartin2@australia php-5.6.7]$ cat test2
69
gcc -Q -O2 --help=optimizers | grep -c -F "[enabled]" > ../ProjResults/O2ena
gcc -Q -O2 --help=optimizers | grep -c -F "[disabled]" > ../ProjResults/O2dis

//Results of above statements
[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O2ena
109
[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O2dis
78

[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O2ena
111
[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O2dis
76

gcc -Q -O1 --help=optimizers | grep -c -F "[enabled]" > ../ProjResults/O1ena
gcc -Q -O1 --help=optimizers | grep -c -F "[disabled]" > ../ProjResults/O1dis

//Results for above tests
[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O1ena
78
[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O1dis
109
[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O1ena
80
[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O1dis
107

gcc -Q -O0 --help=optimizers | grep -c -F "[enabled]" > ../ProjResults/O0ena
gcc -Q -O0 --help=optimizers | grep -c -F "[disabled]" > ../ProjResults/O0dis

[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O0ena
48
[lcmartin2@red php-5.6.7]$ cat ../ProjResults/O0dis
139

[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O0ena
47
[lcmartin2@australia php-5.6.7]$ cat ../ProjResults/O0dis
140

Proceeding

You will notice that in all situations australia has two more flags disabled than red, except with option -O0, where it has only one extra. This difference leads me to believe that with further tinkering with these options, a more powerful set can be discovered. For Phase 2, combinations of flags will be tested against each other in a script to determine how well each combination optimizes.
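The counting done with grep above could also be sketched in Python, which would make it easier to compare levels and machines in one place. This is only an illustration: the function names count_states and query_gcc are my own, and the sample text is a made-up three-line excerpt in the same shape as the gcc output, not real results.

```python
import subprocess
from collections import Counter


def count_states(help_text):
    """Count lines marked [enabled] and [disabled] in gcc -Q output."""
    counts = Counter()
    for line in help_text.splitlines():
        if "[enabled]" in line:
            counts["enabled"] += 1
        elif "[disabled]" in line:
            counts["disabled"] += 1
    return counts


def query_gcc(level):
    """Run gcc -Q <level> --help=optimizers and count the states.

    Not called here, since it needs gcc to be installed on the machine.
    """
    output = subprocess.check_output(
        ["gcc", "-Q", level, "--help=optimizers"], universal_newlines=True
    )
    return count_states(output)


# Made-up excerpt in the format gcc prints, used instead of a live gcc call
sample = """\
  -falign-functions                     [disabled]
  -fauto-inc-dec                        [enabled]
  -fbranch-count-reg                    [enabled]
"""
counts = count_states(sample)
print(counts["enabled"], counts["disabled"])  # 2 1
```

Running query_gcc("-O3") on each machine would then return both counts at once instead of needing two grep commands and two result files.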

Fulfilling these requirements may lead me to the conclusion that, as a whole, the optimization flags are as good as they need to be. In that case I will research which specific functions may need some architecture-specific designs.

PHP offers a handy testing suite, so changes of this type can be made without concern about accidentally breaking things. This allows for a trial-and-error approach, which is much appreciated when dealing with broad changes and a large platform. It will also be very beneficial for testing all optimization flags against all possible tests.

Why PHP?

PHP is an extremely well established language, and finding information on it is not difficult, as it is well documented and well discussed in the community. Tutorials on installation and usage, as well as easily accessible and well laid out source code on git, cut down on sifting through files to find what you are looking for. Additionally, PHP is a developing language with new releases coming out often, so some oversights in well established functionality or in new additions may have happened. More personally, PHP is a language that I foresee myself working with more in my career as a programmer, and knowing more about its inner workings could be beneficial to me.

Plans To Upstream

A drawback of what I said about PHP before is that, as a well established language, contributing to it will be a hard task. With many people wishing to make contributions, feedback will be slow, if any is given at all. If my results are as I expect them to be (a possibility of making a couple of simple changes or implementing some platform-specific code that would improve optimization in a meaningful way), I would still have to work hard to get this contribution noticed. That being said, a contribution upstream may take several months, so it may not be accepted until after the conclusion of the course. Ideally I would like to have a well documented submission and get at least some feedback before the end of the semester.

Full contribution guidelines are here:

https://github.com/php/php-src/blob/master/CODING_STANDARDS

https://github.com/php/php-src/blob/master/README.GIT-RULES

https://github.com/php/php-src/blob/master/README.MAILINGLIST_RULES

https://github.com/php/php-src/blob/master/README.RELEASE_PROCESS

-Liam Martin

Lab 4 – Compiling C Code

In this lab we see the effects of a basic program that is compiled and run in different ways. These will include different options applied to the gcc command, and functions in the main “hello world” program.

In this written segment I will be discussing the results posted here, and giving analysis based on them.

The results are posted here:

https://drive.google.com/file/d/0B9O4WvhvDYsObEFuTFNVSDJpY00/view?usp=sharing

Anyone with this link can view the output on my Google Drive.

Part 1- Base options

Using this code (standard hello world)

#include <stdio.h>

int main() {
    printf("Hello World\n");
}

compiled as gcc -g -O0 -fno-builtin

This is the base compilation that further compilations will be compared against while trying different options.

Part 2- -static option (no change to code)

This increased the size of the program dramatically, and the objdump output became extremely large and almost unreadable because of all the library code now included in the binary. The functions that main depends on are linked into the executable statically rather than resolved from shared libraries at run time.

Part 3- removing -fno-builtin option using standard code

The size of the executable decreased by a very small margin in this part. Without this option the main function does one less move of data, and finishes with nopw instead of nopl. This change happens because, with builtins allowed, the compiler is free to replace the printf call with puts, which takes no variable arguments and so does not need the extra mov.

Part 4- removing -g option using standard code

The start address has changed and the size has been reduced. There is no change to the main functionality. It seems that leaving out the -g option only stops the compiler from generating debugging information, making the file a little smaller.

Part 5 – moving printf to a function call

#include <stdio.h>

/* Wrapper that forwards its arguments to printf */
void printFunc(char* arg1, int arg2) {
    printf(arg1, arg2, "", "", "");
}

int main() {
    printFunc("Hello World", 2);
}

Using this code, the size was larger, and the disassembly was divided into separate parts labelled <main> and <printFunc>.

main contained similar instructions to the base compilation, with just the printFunc operations added.

Part 6 – changing -O0 to -O3

I immediately noticed that the size increased, and that main showed up in a section of its own, separate from the rest of the code. I suspect that -O3 includes some more debug options.

Software Portability and Optimization Presentation

Determining the size of pointers and integers on separate architectures

1. The Issue.

When porting between architectures, a lot of assumptions are made about the relationship between pointers and integers. In the ILP32 data model, pointers and integers usually occupy the same amount of memory, so people will often cast pointers to integers and unsigned integers in order to perform address calculations. This is a fairly widespread problem, and documentation for both x86_64, with its newer LP64 data model, and ARM's Keil tools warns developers against doing these types of things.

These casts are often used in a few situations, such as the following:

“Some lock-free multiprocessor algorithms exploit the fact that a 2+-byte-aligned pointer has some redundancy. They then use the lowest bits of the pointer as boolean flags, for instance. With a processor having an appropriate instruction set, this may eliminate the need for a locking mechanism (which would be necessary if the pointer and the boolean flag were separate).
(Note: This practice is even possible to do safely in Java via java.util.concurrent.atomic.AtomicMarkableReference)”

Via Stack Overflow user kos

Wikipedia –

XOR linked lists store the XOR of two pointers, each cast to an integer, in a single field, which roughly halves the space needed for the links.

2. The Problem

Problem Code from Viva64

char *p;

p = (char *) ((int)p & PAGEOFFSET);   /* truncates the pointer on 64-bit */

DWORD tmp = (DWORD)malloc(ArraySize);   /* DWORD is only 32 bits wide */

int *ptr = (int *)tmp;

Its Solution

char *p;

p = (char *) ((intptr_t)p & PAGEOFFSET);

DWORD_PTR tmp = (DWORD_PTR)malloc(ArraySize);

…

int *ptr = (int *)tmp;

Code written in certain ways will sometimes pass through the compiler with no problem. However, the mismatch in size will affect some systems in weird ways or only be noticed at runtime, which on large systems could become very costly to repair. Even updating the code with new hard-coded sizes (to make use of the larger size, or of the different sizes between architectures) will become a problem again once we have transitioned to 128-bit designs.

3. The Solution

So casting pointers to integers is usually just not advised, as shown in the documentation for ARM and ILP. Another way to fix these problems is to use the stdint.h header file and its defined integer types, such as intptr_t and uintptr_t, which are guaranteed to be wide enough to hold a pointer, so the sizing is handled for you.
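The size mismatch can be checked quickly from Python with the ctypes module, which exposes the platform's C types. This is only a sketch to show the effect: c_void_p stands in here for a pointer-sized type like uintptr_t, and the variable names are my own.

```python
import ctypes

# Sizes of the platform's C int and C pointer, in bytes
int_size = ctypes.sizeof(ctypes.c_int)
ptr_size = ctypes.sizeof(ctypes.c_void_p)
print("int:", int_size, "bytes, pointer:", ptr_size, "bytes")

# Take the address of a real C object
target = ctypes.c_int(42)
address = ctypes.addressof(target)

# A pointer-sized type stores the address without loss
round_trip = ctypes.c_void_p(address).value

# An int-sized type keeps only the low bytes (ctypes does no overflow check)
truncated = ctypes.c_uint(address).value

print("address survives pointer-sized type:", round_trip == address)
print("address survives int-sized type:", truncated == address)
```

On an ILP32 system both checks are expected to print True. On an LP64 system the second one can print False whenever the address does not fit in 32 bits, which is the same loss that the C casts above cause.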

Sources

http://www.keil.com/support/man/docs/c51/c51_le_genptrs.htm

http://www.unix.org/whitepapers/64bit.html

http://www.keil.com/support/man/docs/c51/c51_ap_2bytescalar.htm

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka9502.html

ftp://gcc.gnu.org/pub/gcc/summit/2003/Porting%20to%2064%20bit.pdf

http://www.viva64.com/en/a/0004/

//Stack Overflow thread with posts discussing this topic, can be used for further research

http://stackoverflow.com/questions/7146813/when-is-an-integer-pointer-cast-actually-correct

CODE PULLED FROM SOURCES, WILL SHOW EXAMPLES OF PROBLEM

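#include <cstdint>
#include <cstring>
#include <iostream>
using namespace std;

int main() {
    int testInt = 0;
    const int number = 4;
    void* testPtr = nullptr;
    int* testPtr2 = &testInt;
    // Assumption: the buffers being copied hold pointers, which is the
    // case where sizing by int instead of int * goes wrong.
    int* src[number] = {};
    int* dest[number] = {};

    cout << "Size of Integer: " << sizeof(testInt) << "\n";
    cout << "Size of Void Pointer: " << sizeof(testPtr) << "\n";
    cout << "Size of Integer Pointer: " << sizeof(testPtr2) << "\n";

    // Code fails on an LP64 platform: int is 32 bits but pointers are 64,
    // so the cast loses part of the address. Avoid doing this in most
    // situations. Left commented out so the rest of the file compiles.
    // testInt = (int)testPtr2;
    // testPtr2 = (int *)testInt;

    // intptr_t from <cstdint> is wide enough to hold a pointer
    intptr_t testAddr = (intptr_t)testPtr2;
    testPtr2 = (int *)testAddr;

    // Copies the whole array only where sizeof(int) == sizeof(int *) (32-bit)
    memcpy(dest, src, number * sizeof(int));
    // Copies the whole array on both 32-bit and 64-bit
    memcpy(dest, src, number * sizeof(int *));
}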

Code Review – Lab 1

Analysis of two open-source projects:

Platform:     Open Broadcaster Software      Etherpad Lite

License:      GNU General Public License     Apache License

Languages:    C++ / C / Objective-C          JavaScript / Node.js

Overview:

OBS is an application that lets users stream or record content via screen capture. It is useful for things such as Twitch streams or a tutorial for YouTube.

Etherpad Lite is an open source rewrite of Etherpad, a collaborative real-time editor. It is useful for teamwork as well as live remote instruction.

Contribution:

Etherpad's main goals of synchronous editing have already been achieved. However, after being ported there are still some outlying features and streamlining that have yet to be added. Therefore the main structure of the application should only be edited reluctantly, and additions should be submitted in the form of a plugin. Testing in this model would most likely come from the rest of the community downloading and critiquing, or perhaps even editing, the content. With a strong API and client set, working with several languages is easy. I feel this approach to contribution would be beneficial for people not used to a strict coding style, perhaps for people just being introduced to the open source community: submit a plugin and have some people help you learn through experience. However, discretion must be used so as not to learn bad lessons from people who are also amateurs.

OBS is a developing system as it stands, and a lot of porting has yet to be accomplished. Much of the work that needs to be done will be in C++ and C, and will focus on version-specific bug fixing and streamlining. Combined with a large feature list and the duality of streaming and recording, there is a lot to do. However, under this model they have a more stringent submission policy, a more robust bug tracker, and a community that will help aspiring contributors know what is off limits. Above all, they have a review board that will read a submission and make honest critiques. Its presence means that several changes to the code must usually be made before it finally gets accepted. I feel that working in this community would suit advanced programmers with a passion for open source, as a project would take a lot of dedication in order to be accepted. This method of contribution may discourage newer programmers, and some might take their ideas elsewhere.

– Liam Martin