The wikipedia page for AVL tree is a good source of general information.

A Binary search tree has no mechanism to keep it balanced. If the data being stored is naturally randomized it will generally produce a fairly bushy tree but it is not guaranteed. 

the shorter the tree the faster the search,insert and delete operations, shorter is ALWAYS better
The AVL tree is self balancing to produce the shortest tree possible

As you saw ( or will see) in the performance lab the aditional steps needed to balance the nodes in an AVL tree generally make the insert operations of an AVL tree slower than a BST but the search speed is generally always faster in an AVL tree

lets compare some results of 

Values: 10, 20, 30, 40, 50, 60, 70, 80, 90
BST Height 9
AVL Height 4
Values: 50, 20, 70, 10, 30, 60, 90, 80, 40
This is a special case where the values are in a specific order such that no manipulation was needed so the trees are the same, this is possible for a small set of data but hightly unlikely
BST Height 4
AVL Height 4
Values: 10, 20, 70, 50, 30, 60, 90, 80, 40
BST Height 6
AVL Height 4
Values: 0 to 63 in random order
NOTE:ONLY the last 2 levels of the AVL tree are not fully populated, all levels above the last 2 are 100 full. In the BST tree you will see unused node locations at levels 4,5,6,7,8 & 9
BST: Height 9
AVL: Height 7