More for self-taught programmers

If you're a self-taught programmer, skim a good textbook. But you're not going to do that and most textbooks are bad anyway. So what you really need is just a list of some possible holes in what you know. Here's a quick list:

Decimal math

Math with double (or float) can be off by a tiny amount. For example, 1.7 + 2.3 may not be exactly 4.

That won't matter most of the time, since the off-by is super super small. But it's a problem for exact compares. This next thing adds 0.1 over and over, forever, since it never hits 1.0 exactly:

double x=0.1;
while(x!=1.0) x+=0.1;

When it should logically be 1.0, it will really be something like 0.9999999 or 1.0000001. To fix stuff like this, we compare against a range instead of an exact value, like while(x<=0.999).
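
Another common fix is an "is it close enough?" check. Here's a minimal sketch of that idea. NearlyEqual and its 1e-9 tolerance are made up for this example, not built-ins, and the right tolerance depends on your numbers:

```csharp
using System;

public static class FloatCompareDemo
{
    // Hypothetical helper: true when a and b differ by less than a small tolerance.
    public static bool NearlyEqual(double a, double b, double tolerance = 1e-9)
    {
        return Math.Abs(a - b) < tolerance;
    }

    // Counts how many times 0.1 gets added before x is "close enough" to 1.0.
    // The steps<100 part is only a safety net so the loop can't run forever.
    public static int CountSteps()
    {
        double x = 0.1;
        int steps = 0;
        while (!NearlyEqual(x, 1.0) && steps < 100) { x += 0.1; steps++; }
        return steps;
    }

    public static void Main()
    {
        double sum = 0.1 + 0.2;
        Console.WriteLine(sum == 0.3);            // False: off by a tiny amount
        Console.WriteLine(NearlyEqual(sum, 0.3)); // True
        Console.WriteLine(CountSteps());          // 9
    }
}
```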

Minor stuff

One-line if's

To save space, if's can be on one line with no curly-braces:

if(score<0) score=0;

It only works for single statements. As soon as you have 2 or more, you need the curly-braces.

Commenting-out

Instead of deleting old lines, you can keep them around, just in case, by turning them into temporary comments. Below, we want to save the old line until we test the new version:

//n = [old formula] // did this instead of deleting
n = [new formula]

Block scope

Variables declared inside of {}'s are local to just that block. They go away outside of it:

if(cats>10) {
  int extraCats=(cats-dogs)/2; // used in this IF
  ...
}

// extraCats is gone

Modulo

% gives the remainder. 9%4 is 1 (4 goes in twice, with 1 left over). Common uses are if(n%2==0), which checks for even numbers (dividing by 2 has remainder 0), and n%10, which gives the 1's place.
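
Those uses, as a small sketch you can run. IsEven and OnesPlace are just names invented for this example:

```csharp
using System;

public static class ModuloDemo
{
    public static bool IsEven(int n) { return n % 2 == 0; }
    public static int OnesPlace(int n) { return n % 10; }

    public static void Main()
    {
        Console.WriteLine(9 % 4);           // 1
        Console.WriteLine(IsEven(10));      // True
        Console.WriteLine(OnesPlace(1234)); // 4
        Console.WriteLine(-7 % 2);          // -1, not 1
    }
}
```

The last line is the one surprise: in C# a negative left side gives a negative remainder, so n%2==1 misses negative odd numbers. n%2!=0 catches them.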

Precedence

Math operators go in the same order as normal: times and divide before plus and subtract. 3 * 2+5 is 6+5, the same as normal math. < and > naturally go after math, and && and || naturally go after those. People will add extra parens to make it look nice, but they aren't required.

Map/Dictionary

These are a fun way to quickly link strings to other values:

Dictionary<string,double> ToolCost = new Dictionary<string,double>();
ToolCost["hammer"]=6.8;
double costWithTax = ToolCost["hammer"]*1.2; // looking up a missing key is an error
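
Looking up a key that was never added throws an error, so a lookup often gets a guard. A minimal sketch using the built-in TryGetValue. The CostWithTax function and the -1 "not found" value are only conventions picked for this example:

```csharp
using System;
using System.Collections.Generic;

public static class ToolCostDemo
{
    // Returns the cost with 20% tax, or -1 when the tool isn't in the dictionary.
    public static double CostWithTax(Dictionary<string, double> toolCost, string tool)
    {
        double cost;
        if (toolCost.TryGetValue(tool, out cost)) return cost * 1.2;
        return -1; // made-up "not found" marker
    }

    public static void Main()
    {
        var toolCost = new Dictionary<string, double>();
        toolCost["hammer"] = 6.8;
        Console.WriteLine(CostWithTax(toolCost, "hammer")); // found
        Console.WriteLine(CostWithTax(toolCost, "wrench")); // not found
    }
}
```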

Skip these

These are things that the internet might shove at you, but aren't worth the trouble to learn. Some are special-purpose and you won't need them. Others don't do anything you can't already do.

Properties: These are things like public int n { get; set; }. A regular public variable, public int n;, does the same job for everyday use and is easier.
Funny integer types: There are piles of integer types -- short, long, unsigned, Int16 vs. Int32. Just use int.
switch's: These have lots of funny rules, and cascading if's (later) can do everything switches can do, and more.
do-while loops: for and while loops do everything you need.
namespaces: These are for organizing huge multi-person projects. A program with fewer than 20,000 lines won't need them, and they're easy to add later.

IF's

Cascading if's

When you have a list and want to check for A or B or C ... there's a standard form to do that:

if(n==1) print("tiny");
else if(n==2) print("small");
else if(n<=4) print("medium");
else if(n<=7) print("big");
else print("huge");

It quits when it finds a match, allows easy checking for ranges ("medium" is 3 or 4), and allows an optional "none-of-the-above" at the end.
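
Here's the same cascade as a runnable sketch. Wrapping it in a function named SizeName (and returning the word instead of printing it) is a choice made for this example:

```csharp
using System;

public static class SizeDemo
{
    public static string SizeName(int n)
    {
        if (n == 1) return "tiny";
        else if (n == 2) return "small";
        else if (n <= 4) return "medium"; // only reached by 3 and 4 (and anything below 1)
        else if (n <= 7) return "big";    // only reached by 5, 6 and 7
        else return "huge";               // none-of-the-above
    }

    public static void Main()
    {
        for (int n = 1; n <= 9; n++) Console.WriteLine(n + ": " + SizeName(n));
    }
}
```

Since each test only runs when the ones above it failed, n<=4 doesn't need to say n>=3. The flip side: 0 or a negative number also lands in "medium", so add a check at the top if that matters.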

bool type

bool is a true/false type. The technical name is boolean. It can be used inside of if's, allowing you to compute long conditions in steps:

bool is1to10 = n>=1 && n<=10;
bool isWholeNumber = (int)n == n;

if(is1to10 && isWholeNumber) ...

Notice how we didn't need ==true, just if(is1to10) is good enough.

Flags

You may read advice to "use a flag". That's just a global bool. It's called a flag after an old-style mailbox up/down flag. There's nothing special about a global true/false. If you want to remember whether a light is on, an int will work (0=off, 1=on), but bool's can look nicer.

short-circuit

When an if-condition has &&'s or ||'s the computer doesn't just check them all. It goes left-to-right and quits when it knows the answer. This is on purpose to let us guard things at the end. A common use is to prevent out-of-range index errors. This code is safe (the last part, looking up A[i], can't be out-of-range):

if(i>=0 && i<A.Length && A[i]==3) // safe

When i is too big or small, it quits immediately. The trick also works for OR's, but in a different way: an || quits as soon as one part is true, so the guards check for the bad cases instead.
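
A small sketch of both versions. IsThreeAt and IsMissingOrZero are names invented for this example:

```csharp
using System;

public static class ShortCircuitDemo
{
    // AND version: A[i] is only looked at when both range checks passed.
    public static bool IsThreeAt(int[] A, int i)
    {
        return i >= 0 && i < A.Length && A[i] == 3;
    }

    // OR version: quits with true as soon as i is out of range,
    // so A[i] is only looked at when i is a legal index.
    public static bool IsMissingOrZero(int[] A, int i)
    {
        return i < 0 || i >= A.Length || A[i] == 0;
    }

    public static void Main()
    {
        int[] A = { 5, 3, 0 };
        Console.WriteLine(IsThreeAt(A, 1));       // True
        Console.WriteLine(IsThreeAt(A, 7));       // False, and no index error
        Console.WriteLine(IsMissingOrZero(A, 7)); // True, and no index error
        Console.WriteLine(IsMissingOrZero(A, 0)); // False
    }
}
```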

Flipping long IF's

Be careful trying to write the opposite of a long IF. It's trickier than it looks because you also have to flip the AND's to OR's:

// between 1 and 10:
if(n>=1 && n<=10)

// not between 1 and 10:
if(n<1 || n>10)

This is called De Morgan's law if you want to look it up.

not(!)

Exclamation point stands for "not". It flips true/false. This checks for words which don't have "rat" in them:

if(!w.Contains("rat")) // words without "rat"

if(w.Contains("rat")==false) // same, but a little longer

It can be a safe way of flipping something complicated. For example, "n is not 2, 7 or 9":

if( !(n==2 || n==7 || n==9) )

Loops

foreach is a shortcut

A foreach loop is great for quickly looking through an array, but whenever you're stuck trying to do something tricky with a foreach, go back to a regular old index loop. Suppose we need to skip the first item. A for-loop can simply start at 1:

// check everything _except the first item_:
for(int i=1; i<A.Length; i++) ...
//        ^ one, not zero

while(!done)

For really strange loops, a done boolean can often help. The template looks like this:

bool done=false;
while(!done) {

  if( [something] ) done=true;
  if( [something else] ) done=true;
}

That lets you assemble when to quit over as many lines as you want, instead of having to cram it all into the while parens.

break; continue;

break; instantly quits a loop. It's good for "find one thing in a list" loops. continue; skips to the end of the loop, but then keeps looping. It's good for "don't count this one, but keep going".

while( [something] ) {
  ...
  if( [something else] ) continue; // go straight to next item
  if( [a third thing] ) { [do some stuff]; break; } // quit the loop
  ...
}
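
Here's a filled-in sketch of both. The problem (add up numbers until the first 0, ignoring negatives) and the name SumUntilZero are made up for this example:

```csharp
using System;

public static class BreakContinueDemo
{
    // Adds up the numbers before the first 0, skipping negatives.
    public static int SumUntilZero(int[] nums)
    {
        int total = 0;
        foreach (int n in nums)
        {
            if (n < 0) continue; // don't count this one, but keep going
            if (n == 0) break;   // found the stop marker: quit the loop
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumUntilZero(new int[] { 4, -2, 5, 0, 9 })); // 9
    }
}
```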

while(true)

This is pretty much the same as the while(!done) loop, except using a break;. It can handle the really weird loop problems:

while(true) { // never quits? But quits w/break, inside
  if( [out of options] ) break;
  ...
  if( [found item] ) break;
  ...
}

An advantage is that break quits immediately, which can be nicer than setting done=true and then wrapping the rest of the loop in an else.

classes

Plain old data classes

You might think of classes as being all fancy, but you can create them to hold simple public variables. What often happens is you first create regular variables:

// 1st attempt, names of 3 people:
string fName1, lName1;
string fName2, lName2;
string fName3, lName3;

You'll be making lots of them, and first and last name always go together, so a class is a nice shortcut:

class fullName {
  public string first;
  public string last;
}

fullName name1 = new fullName(); // creates first and last, glued together
fullName name2 = new fullName(); // ...and so on

Convenience functions

A regular data class often gets a helpful function or two. Maybe we want an easy way to set both parts and a quicker way to show it as "Smith, John":

class fullName {
  public string first, last;

  public void set(string fName, string lName) { // might be useful
    first=fName;
    last=lName;
  }

  public string asString() { return last+", "+first; }
}

It's still not some big fancy class -- it's just n.first and n.last with 2 shortcuts. name1.set("Gar", "Reblo"); is a little nicer than name1.first="Gar"; ... .
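
Put together as a complete program, a sketch might look like this. The FullNameDemo wrapper is only there so it runs:

```csharp
using System;

public class fullName
{
    public string first, last;

    public void set(string fName, string lName)
    {
        first = fName;
        last = lName;
    }

    public string asString() { return last + ", " + first; }
}

public static class FullNameDemo
{
    public static void Main()
    {
        fullName name1 = new fullName();
        name1.set("Gar", "Reblo");
        Console.WriteLine(name1.asString()); // Reblo, Gar

        name1.first = "Garth";               // the plain variables still work
        Console.WriteLine(name1.asString()); // Reblo, Garth
    }
}
```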

Arrays

Lists of variables

You might think of arrays as lists of words or preset values, but they can be a list of variables. For example, say we have 6 variables to record dice rolls:

// number of times we rolled that number:
int d1, d2, d3, d4, d5, d6;

We can make the same thing (6 int variables) with one size-6 array:

int[] D=new int[6]; // 6 integer variables

We can set them all to 0 with a loop:

for(int i=0;i<6;i++) D[i]=0; // reset all to 0

Being able to use 0-5 to "look-up" a box is great. Suppose n is the number we just rolled. D[n-1]++; adds 1 to its total (in other words, if we roll a 2, that short line adds to the 2's count). Without an array, we'd be stuck using 6 if's.
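
A runnable sketch of the dice counter. The Tally function and the fixed list of rolls are invented for this example (real code would roll randomly):

```csharp
using System;

public static class DiceDemo
{
    // Returns a size-6 array: slot 0 counts the 1's, slot 5 counts the 6's.
    public static int[] Tally(int[] rolls)
    {
        int[] D = new int[6]; // C# starts new int arrays at all 0's
        foreach (int n in rolls) D[n - 1]++;
        return D;
    }

    public static void Main()
    {
        int[] D = Tally(new int[] { 2, 5, 2, 6, 2 });
        for (int i = 0; i < 6; i++)
            Console.WriteLine((i + 1) + " was rolled " + D[i] + " times");
    }
}
```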

parallel arrays

Sometimes it naturally happens you have several arrays representing a list of multi-part items. Suppose cats have name, age and weight. We could make 20 cats like this:

string[] catNames = new string[20];
int[] catAges = new int[20];
double[] catWeights = new double[20];

That works: cat #7 is spread out among the slot #7's of the name, age and weight arrays. But it might be nicer to make a Cat class and have one array of that:

Cat[] Cats = new Cat[20];
for(int i=0;i<20;i++) Cats[i]=new Cat();

Cats[3].age=5; // using it

Array-backed List class

The List class is better than an array for most things, since it can grow at the end and can delete items. But it's literally an array -- it secretly uses one. It's declared like List<int> instead of int[].
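
A short sketch of a List doing the things an array can't. The scores and the Build function are made up for this example:

```csharp
using System;
using System.Collections.Generic;

public static class ListDemo
{
    public static List<int> Build()
    {
        List<int> scores = new List<int>(); // starts empty, no size needed
        scores.Add(10);                     // grows at the end
        scores.Add(20);
        scores.Add(30);
        scores.RemoveAt(0);                 // deletes the first item; the rest slide down
        scores[0] = 25;                     // indexes just like an array
        return scores;
    }

    public static void Main()
    {
        List<int> s = Build();
        Console.WriteLine(s.Count + " items, first is " + s[0]); // 2 items, first is 25
    }
}
```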

namespaces, dots, static

For organization, built-ins are divided among folders, called namespaces. System is a main one. Inside it are more namespaces, such as IO for files, along with classes such as Random. To find the file-reading class we use System.IO.FileStream. So far, so good.

Dot-reuse: c1.claws.sharpness uses dots to look inside of object c1. Dots are used with classes. But dots are also used with namespaces. Hmmm... The two things are different, and it's a little confusing we use the same symbol for both things.
using's: Having to write System.IO every time is a pain. using System.IO; adds a shortcut to leave it out. It's common to have tons of using's at the top of a program. Nothing bad happens if you have ones you don't need.
3rd party namespaces: Most plug-ins and API's also put their stuff into namespaces. That can look funny, but works the same as normal. The entire package may be in namespace BugMachine and you may need to use BugMachine.Useful.Bee to get the bee class. As usual, you can add using BugMachine; to save typing later.
static classes: C# sometimes uses classes as namespaces. For example, System.IO.Directory holds normal functions for playing around with folders, but the manual says Directory is a class! The thing is, it's not a class you make objects from. It's marked as static, which means it works like a namespace.
static class functions: Even real classes can be used as namespaces. The manual will mark those functions as static. Suppose GetRandomCat is static in the Cat class. You'd call it like Cat c1=Cat.GetRandomCat();, using Cat as a namespace. To compare, Cat.claws is an error if claws isn't static -- you'd need to have a real cat.

To sum up: remember that dots can be class member-dots, or they could be namespace path-dots. And that some functions in a class are just normal functions, if you see static by them.
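
Here's a sketch showing the different kinds of dots side by side. This Cat class and its MakeStray function are hypothetical, invented for the example. System.Math.Max is a real built-in:

```csharp
using System;

public class Cat
{
    public string name;
    public int claws = 18;

    // static: called as Cat.MakeStray(), with no existing cat needed
    public static Cat MakeStray()
    {
        Cat c = new Cat();
        c.name = "Stray";
        return c;
    }
}

public static class StaticDemo
{
    public static void Main()
    {
        Cat c1 = Cat.MakeStray();                    // Cat used like a namespace
        Console.WriteLine(c1.name + " " + c1.claws); // member-dots on a real cat
        Console.WriteLine(System.Math.Max(3, 7));    // namespace dot, then static-class dot
        // Console.WriteLine(Cat.claws);             // error: claws isn't static
    }
}
```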

Efficiency

Program speed-up tricks don't do any good. Most don't work, some make the program slower, and the rest are a super-tiny speed-up. You're basically wasting time and causing bugs for no reason.

For example, it seems as if you could take lots of little steps and combine them into a single, faster equation. Nope. The compiler uncombines them. It even adds back temporary variables which you eliminated. Attempts at little speed-ups are called micro-optimizations if you want to look it up.

The one place to worry about speed is avoiding extra loops. Take this sample code:

foreach(Frog f in Frogs) {
  Frog bestFrog = findBestFrog();
  if( [something with bestFrog] ) ...
  ...
}

The problem is that findBestFrog is probably a loop through every Frog. It's a nested loop. With 1,000 frogs we have a million steps total (1,000 times 1,000). If we know the best frog doesn't change here then we could compute it ahead-of-time, removing the nested loop for a big speed-up:

Frog bestFrog = findBestFrog();

foreach(Frog f in Frogs) {
  if( [something with bestFrog] ) ...
  ...
}
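
To see the difference in numbers, here's a sketch that counts steps. Frogs are plain ints here, "best" just means biggest, and FindBest, the steps counter and MakeFrogs are all stand-ins invented for this example:

```csharp
using System;

public static class FrogDemo
{
    public static int steps = 0; // counts inner-loop steps, for comparison only

    // Stand-in for findBestFrog: a loop through every frog.
    public static int FindBest(int[] frogs)
    {
        int best = frogs[0];
        foreach (int f in frogs) { steps++; if (f > best) best = f; }
        return best;
    }

    // Slow: calls FindBest again for every frog (a hidden nested loop).
    public static int CountBestSlow(int[] frogs)
    {
        int count = 0;
        foreach (int f in frogs) if (f == FindBest(frogs)) count++;
        return count;
    }

    // Fast: the best frog is computed once, ahead of time.
    public static int CountBestFast(int[] frogs)
    {
        int best = FindBest(frogs);
        int count = 0;
        foreach (int f in frogs) if (f == best) count++;
        return count;
    }

    public static int[] MakeFrogs()
    {
        int[] frogs = new int[1000];
        for (int i = 0; i < frogs.Length; i++) frogs[i] = i % 50;
        return frogs;
    }

    public static void Main()
    {
        int[] frogs = MakeFrogs();
        steps = 0; CountBestSlow(frogs); Console.WriteLine(steps); // 1000000
        steps = 0; CountBestFast(frogs); Console.WriteLine(steps); // 1000
    }
}
```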

Big-O: the manual for built-in functions (usually for arrays and lists) tells you how much looping they do using "big-O" notation. Roughly: O(1) means no loops, O(n) means a loop, and O(n^2) means a nested loop.

Sometimes this can be useful. For example, removing from the front of a List is O(n) -- a loop -- but removing from the back is only O(1) -- not a loop. If you can turn your front-removes into back-removes, you get a big speed-up.

Linked-list class: a linked-list does the same thing as an array (or a List). It just holds a list of items. But it works differently, making it a lot faster for some special things.

In a linked-list, each item has a link to the items that come before and after. This makes it fast to add or delete anywhere, once you're at that spot, but takes extra space and makes it much slower to just jump to any box. For super-intense code on a big list where you'll mostly add and remove from all over, a linked-list can be a huge speed-up. But otherwise, just use an array or List.
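
C# has a built-in LinkedList class. A small sketch of what using it looks like (the letters and the Build function are made up for this example):

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static string Build()
    {
        LinkedList<string> line = new LinkedList<string>();
        line.AddLast("b");
        line.AddLast("d");
        line.AddFirst("a");                        // add at the front: nothing slides over

        LinkedListNode<string> d = line.Find("d"); // getting to a spot is still a loop
        line.AddBefore(d, "c");                    // once there, inserting just relinks

        line.RemoveFirst();                        // delete from the front: also no sliding
        return string.Join(",", line);
    }

    public static void Main()
    {
        Console.WriteLine(Build()); // b,c,d
    }
}
```

Notice there's no line[2]. Jumping to a box by number is the thing a linked-list is bad at.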

Pointers

References are used in 2 different ways, which isn't really explained. 95% of the time they're just variables. Here c1 and c2 are just cats:

Cat c1 = new Cat(); c1.name="Ally";
Cat c2 = new Cat(); c2.name="Bear";

But we also use them as "pointers". In our minds, activeCat isn't a cat. Its job is to point to some other cat:

// activeCat will point to some real cat:
Cat activeCat = c1;
if( [something] ) activeCat=c2;

Here's an example of the same thing with arrays. AllDogs is an array of 20 real dogs:

Dog[] AllDogs = new Dog[20];
for(int i=0;i<20;i++) AllDogs[i]=new Dog();

We'll select from these for 6-dog sled teams. SledTeam1 is an array of 6 pointers to dogs in the AllDogs array:

// these will only point to things in AllDogs:
Dog[] SledTeam1 = new Dog[6];

// no dogs on the team, yet. Set pointers to null:
for(int i=0;i<6;i++) SledTeam1[i]=null;

// choose some dogs onto the team:
SledTeam1[0]=AllDogs[3];
SledTeam1[1]=AllDogs[16];

You can't tell from how they're declared -- both are Dog[] dog-arrays. You have to figure out real-or-pointer? from how they're used.
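
A quick way to convince yourself a pointer isn't its own cat: change something through it and look at the real one. A minimal sketch, with a bare-bones Cat class and a Run function invented for this example:

```csharp
using System;

public class Cat { public string name; }

public static class PointerDemo
{
    public static string Run()
    {
        Cat c1 = new Cat(); c1.name = "Ally";
        Cat c2 = new Cat(); c2.name = "Bear";

        Cat activeCat = c1;       // points at c1; no new cat is made
        activeCat.name = "Alley"; // really changes c1

        activeCat = c2;           // now points at c2; c1 isn't touched by this
        return c1.name + "/" + activeCat.name;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // Alley/Bear
    }
}
```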

Structs

C# has a value-type version of classes, named struct. Classes do everything you need, and work better. You never need to create a struct, but you may have to work with them.

A struct variable is never a pointer to some other struct. Assignment statements make copies, the same as with int's or double's. w2=w1; copies. w2 is still a different thing than w1.
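
A sketch of the difference, using the same w2=w1 assignment on a struct and on a class. WeightStruct and WeightClass are hypothetical types made up for this example:

```csharp
using System;

public struct WeightStruct { public int grams; }
public class WeightClass { public int grams; }

public static class StructDemo
{
    public static string Run()
    {
        WeightStruct w1 = new WeightStruct(); w1.grams = 4;
        WeightStruct w2 = w1; // struct: makes a copy
        w2.grams = 9;         // only the copy changes

        WeightClass c1 = new WeightClass(); c1.grams = 4;
        WeightClass c2 = c1;  // class: c2 points to the same object
        c2.grams = 9;         // c1 sees the change

        return w1.grams + " " + c1.grams;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 4 9
    }
}
```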