ECE 250. Algorithms and Data Structures. Lists

Содержание

Слайд 2

Outline We will now look at our first abstract data structure

Outline

We will now look at our first abstract data structure
Relation: explicit

linear ordering
Operations
Implementations of an abstract list with:
Linked lists
Arrays
Memory requirements
Strings as a special case
The STL vector class
Слайд 3

Definition An Abstract List (or List ADT) is linearly ordered data

Definition

An Abstract List (or List ADT) is linearly ordered data where

the programmer explicitly defines the ordering
We will look at the most common operations that are usually
The most obvious implementation is to use either an array or linked list
These are, however, not always the most optimal

3.1

Слайд 4

Operations Operations at the kth entry of the list include: Access

Operations

Operations at the kth entry of the list include:
Access to

the object Erasing an object
Insertion of a new object Replacement of the object

3.1.1

Слайд 5

Operations Given access to the kth object, gain access to either

Operations

Given access to the kth object, gain access to either the

previous or next object
Given two abstract lists, we may want to
Concatenate the two lists
Determine if one is a sub-list of the other

3.1.1

Слайд 6

Locations and run times The most obvious data structures for implementing

Locations and run times

The most obvious data structures for implementing an

abstract list are arrays and linked lists
We will review the run time operations on these structures
We will consider the amount of time required to perform actions such as finding, inserting new entries before or after, or erasing entries at
the first location (the front)
an arbitrary (kth) location
the last location (the back or nth)
The run times will be Θ(1), O(n) or Θ(n)

3.1.2

Слайд 7

Linked lists We will consider these for Singly linked lists Doubly linked lists 3.1.3

Linked lists

We will consider these for
Singly linked lists
Doubly linked lists

3.1.3

Слайд 8

Singly linked list 3.1.3.1 * These assume we have already accessed the kth entry—an O(n) operation

Singly linked list

3.1.3.1

* These assume we have already accessed the kth

entry—an O(n) operation
Слайд 9

Singly linked list 3.1.3.1 By replacing the value in the node

Singly linked list

3.1.3.1

By replacing the value in the node in question,

we can speed things up
– useful for interviews
Слайд 10

Doubly linked lists 3.1.3.2 * These assume we have already accessed the kth entry—an O(n) operation

Doubly linked lists

3.1.3.2

* These assume we have already accessed the kth

entry—an O(n) operation
Слайд 11

Doubly linked lists 3.1.3.2 Accessing the kth entry is O(n)

Doubly linked lists

3.1.3.2

Accessing the kth entry is O(n)

Слайд 12

Other operations on linked lists 3.1.3.3 Other operations on linked lists

Other operations on linked lists

3.1.3.3

Other operations on linked lists include:
Allocation and

deallocating the memory requires Θ(n) time
Concatenating two linked lists can be done in Θ(1)
This requires a tail pointer
Слайд 13

Arrays 3.1.4 We will consider these operations for arrays, including: Standard or one-ended arrays Two-ended arrays

Arrays

3.1.4

We will consider these operations for arrays, including:
Standard or one-ended arrays
Two-ended

arrays
Слайд 14

Standard arrays 3.1.4 We will consider these operations for arrays, including:

Standard arrays

3.1.4

We will consider these operations for arrays, including:
Standard or one-ended

arrays
Two-ended arrays
Слайд 15

Run times * Assume we have a pointer to this node

Run times

* Assume we have a pointer to this node

Слайд 16

Data Structures In general, we will only use these basic data

Data Structures

In general, we will only use these basic data structures

if we can restrict ourselves to operations that execute in Θ(1) time, as the only alternative is O(n) or Θ(n)
Interview question: in a singly linked list, can you speed up the two O(n) operations of
Inserting before an arbitrary node?
Erasing any node that is not the last node?
If you can replace the contents of a node, the answer is “yes”
Replace the contents of the current node with the new entry and insert after the current node
Copy the contents of the next node into the current node and erase the next node
Слайд 17

Memory usage versus run times All of these data structures require

Memory usage versus run times

All of these data structures require Θ(n)

memory
Using a two-ended array requires one more member variable, Θ(1), in order to significantly speed up certain operations
Using a doubly linked list, however, required Θ(n) additional memory to speed up other operations
Слайд 18

Memory usage versus run times As well as determining run times,

Memory usage versus run times

As well as determining run times, we

are also interested in memory usage
In general, there is an interesting relationship between memory and time efficiency
For a data structure/algorithm:
Improving the run time usually requires more memory
Reducing the required memory usually requires more run time
Слайд 19

Memory usage versus run times Warning: programmers often mistake this to

Memory usage versus run times

Warning: programmers often mistake this to suggest

that given any solution to a problem, any solution which may be faster must require more memory
This guideline not true in general: there may be different data structures and/or algorithms which are both faster and require less memory
This requires thought and research
Слайд 20

The sizeof Operator In order to determine memory usage, we must

The sizeof Operator

In order to determine memory usage, we must know

the memory usage of the various built-in data types and classes
The sizeof operator in C++ returns the number of bytes occupied by a data type
This value is determined at compile time
It is not a function
Слайд 21

The sizeof Operator #include using namespace std; int main() { cout

The sizeof Operator

#include
using namespace std;
int main() {
cout << "bool

" << sizeof( bool ) << endl;
cout << "char " << sizeof( char ) << endl;
cout << "short " << sizeof( short ) << endl;
cout << "int " << sizeof( int ) << endl;
cout << "char * " << sizeof( char * ) << endl; cout << "int * " << sizeof( int * ) << endl;
cout << "double " << sizeof( double ) << endl;
cout << "int[10] " << sizeof( int[10] ) << endl;
return 0;
}

{eceunix:1} ./a.out # output
bool 1
char 1
short 2
int 4
char * 4
int * 4
double 8
int[10] 40
{eceunix:2}

Слайд 22

Abstract Strings A specialization of an Abstract List is an Abstract

Abstract Strings

A specialization of an Abstract List is an Abstract String:
The

entries are restricted to characters from a finite alphabet
This includes regular strings “Hello world!”
The restriction using an alphabet emphasizes specific operations that would seldom be used otherwise
Substrings, matching substrings, string concatenations
It also allows more efficient implementations
String searching/matching algorithms
Regular expressions
Слайд 23

Abstract Strings Strings also include DNA The alphabet has 4 characters:

Abstract Strings

Strings also include DNA
The alphabet has 4 characters: A, C,

G, and T
These are the nucleobases:
adenine, cytosine, guanine, and thymine
Bioinformatics today uses many of the algorithms traditionally restricted to computer science:
Dan Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge, 1997
http://books.google.ca/books?id=STGlsyqtjYMC
References:
http://en.wikipedia.org/wiki/DNA
http://en.wikipedia.org/wiki/Bioinformatics
Слайд 24

Standard Template Library In this course, you must understand each data

Standard Template Library

In this course, you must understand each data structure

and their associated algorithms
In industry, you will use other implementations of these structures
The C++ Standard Template Library (STL) has an implementation of the vector data structure
Excellent reference:
http://www.cplusplus.com/reference/stl/vector/
Слайд 25

Standard Template Library #include #include using namespace std; int main() {

Standard Template Library

#include
#include
using namespace std;
int main() {
vector v(

10, 0 );
cout << "Is the vector empty? " << v.empty() << endl;
cout << "Size of vector: " << v.size() << endl;
v[0] = 42;
v[9] = 91;
for ( int k = 0; k < 10; ++k ) {
cout << "v[" << k << "] = " << v[k] << endl;
}
return 0;
}

$ g++ vec.cpp
$ ./a.out
Is the vector empty? 0
Size of vector: 10
v[0] = 42
v[1] = 0
v[2] = 0
v[3] = 0
v[4] = 0
v[5] = 0
v[6] = 0
v[7] = 0
v[8] = 0
v[9] = 91
$

Слайд 26

Summary In this topic, we have introduced Abstract Lists Explicit linear

Summary

In this topic, we have introduced Abstract Lists
Explicit linear orderings
Implementable with

arrays or linked lists
Each has their limitations
Introduced modifications to reduce run times down to Θ(1)
Discussed memory usage and the sizeof operator
Looked at the String ADT
Looked at the vector class in the STL
Слайд 27

References [1] Donald E. Knuth, The Art of Computer Programming, Volume

References

[1] Donald E. Knuth, The Art of Computer Programming, Volume 1: Fundamental

Algorithms, 3rd Ed., Addison Wesley, 1997, §2.2.1, p.238.
[2] Weiss, Data Structures and Algorithm Analysis in C++, 3rd Ed., Addison Wesley, §3.3.1, p.75.