Copying a reference, or cloning

Archive - Originally posted on "The Horse's Mouth" - 2006-03-05 07:12:24 - Graham Ellis

If you copy a variable in a program, you end up with a duplicate, right?
set second $first ... in Tcl
second=$first ... in shell
second = first ... in Python, Ruby, C and Java
$second = $first ... in Perl and PHP

Well - ALMOST right. For sure an assignment copies a variable, but where that variable is a reference you may be doing no more than assigning a second name to the same underlying data. For example, if I had a Python object called charlie and I wrote thecat = charlie, then any subsequent changes that I made to the object via charlie would also be reflected in data accessed via thecat, and vice versa. When you think about it, this is a natural way of doing things, and also efficient in that it does not duplicate data, but it's quite a tough idea for programmers from a structured programming environment.
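As a minimal sketch of that aliasing, assuming charlie is just a dictionary (the original object's class isn't shown here, so the contents are invented for illustration):

```python
# Hypothetical data: a dict standing in for the "charlie" object.
charlie = {"name": "Charlie", "age": 9}

# Assignment binds a second name to the same object; nothing is duplicated.
thecat = charlie

# A change made through one name is visible through the other.
thecat["age"] = 10

same_object = thecat is charlie
print(same_object)      # True
print(charlie["age"])   # 10
```

The is operator tests identity rather than equality, so it is a handy way to check whether two names refer to one object.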

There are actually three levels of copying a variable in Python.

If you do a straightforward assignment, you're copying the reference - in other words, you're giving the object a new name or an alias, and content changes made under either name will be reflected in the same object. Our charlie and thecat pair is a good example of where you might do something like this.

If you use the deepcopy function (from the standard copy module), then you'll be cloning an object and you'll duplicate its content, so that any subsequent amendment to the original or the copy will be unique to the original or to the copy. If you had an object that contained this year's data and you wanted a new object for next year's data too, you might do a deep copy ... then change next year's data to reflect the events you are expecting to happen. Next year's data changes but this year's remains unaltered.
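Here is a small sketch of the this year / next year idea. The dictionary layout and the figures are assumptions made up for the illustration; only deepcopy itself comes from the standard library.

```python
from copy import deepcopy

# Hypothetical figures, invented for illustration.
this_year = {"staff": ["Lisa", "Graham"], "budget": {"training": 1000}}

# deepcopy duplicates the dict and everything nested inside it.
next_year = deepcopy(this_year)

# Amend next year's data only.
next_year["staff"].append("Charlie")
next_year["budget"]["training"] = 1500

print(this_year["staff"])               # ['Lisa', 'Graham']
print(this_year["budget"]["training"])  # 1000
print(next_year["staff"])               # ['Lisa', 'Graham', 'Charlie']
```

Because the nested list and dictionary were duplicated too, nothing done to next_year leaks back into this_year.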

The third (intermediate) level of copying in Python uses list slice notation and is described as a shallow copy. If you have a table of staff members and you shallow copy it, you'll end up with a new table of staff members. Take the copy, add in some extra staff, take some away, and only the copy is affected. However, each staff member object is shared between the original and the new table, so if you change a property of an individual staff member - say their home address is altered - then that change will show up in both copies. Once again, in the correct circumstance this is a very natural way to handle data.
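The staff table behaviour can be sketched like this, assuming each staff member is a plain dictionary (the names and addresses are hypothetical):

```python
# Hypothetical staff records; each dict stands in for a staff member object.
staff = [
    {"name": "Lisa", "address": "1 High Street"},
    {"name": "Graham", "address": "2 High Street"},
]

# A full slice builds a new list that holds the same member objects.
copied = staff[:]

# Adding a member changes only the copy ...
copied.append({"name": "Charlie", "address": "3 High Street"})
print(len(staff), len(copied))   # 2 3

# ... but altering an existing member shows up in both tables.
copied[1]["address"] = "9 Mill Lane"
print(staff[1]["address"])       # 9 Mill Lane
print(copied[1] is staff[1])     # True
```

For a list, list(staff) and copy.copy(staff) give the same kind of shallow copy as the slice.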

Here are the three syntaxes - deep copy (i.e. clone), shallow copy and alias:
from copy import deepcopy
firstyear = deepcopy(team)   # deep copy (clone)
secondyear = team[:]         # shallow copy
thirdyear = team             # alias


I've written these three examples into an example program available under the More on Collections and Sequences in Python section of our resource centre ... here are the results of running the program to see the effect of changes in firstyear, secondyear and thirdyear variables when the team gets changed.

The changes made:
team[2] = person("Charlotte","Cat's home entertainer",10)
team[1].setage(53)


and a listing of the various teams ...

Original team
Lisa
Trained as Graphic Designer
Aged 21
Graham
Trained as Support Manager
Aged 51
Charlie
Trained as Unknown
Aged 9

Team after changes
Lisa
Trained as Graphic Designer
Aged 21
Graham
Trained as Support Manager
Aged 53
Charlotte
Trained as Cat's home entertainer
Aged 10

Deep copy - no changes from original
Lisa
Trained as Graphic Designer
Aged 21
Graham
Trained as Support Manager
Aged 51
Charlie
Trained as Unknown
Aged 9

Shallow copy - some changes
Lisa
Trained as Graphic Designer
Aged 21
Graham
Trained as Support Manager
Aged 53
Charlie
Trained as Unknown
Aged 9

Normal copy (alias) - all changes shown
Lisa
Trained as Graphic Designer
Aged 21
Graham
Trained as Support Manager
Aged 53
Charlotte
Trained as Cat's home entertainer
Aged 10
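The listings above can be reproduced with a sketch along the following lines. The person class here is an assumption: it is a stand-in with a constructor and setage method matching the calls shown earlier, not the class from the original program, and the describe and summary helpers are hypothetical additions for printing and checking.

```python
from copy import deepcopy


class person:
    """Hypothetical stand-in for the class used in the original program."""

    def __init__(self, name, job, age):
        self.name = name
        self.job = job
        self.age = age

    def setage(self, age):
        self.age = age

    def describe(self):
        return f"{self.name}\nTrained as {self.job}\nAged {self.age}"


team = [
    person("Lisa", "Graphic Designer", 21),
    person("Graham", "Support Manager", 51),
    person("Charlie", "Unknown", 9),
]

firstyear = deepcopy(team)   # deep copy (clone)
secondyear = team[:]         # shallow copy
thirdyear = team             # alias

# The changes made: replace one list element, alter one existing member.
team[2] = person("Charlotte", "Cat's home entertainer", 10)
team[1].setage(53)


def summary(members):
    """Hypothetical helper: (name, age) pairs for easy comparison."""
    return [(m.name, m.age) for m in members]


for title, members in [
    ("Deep copy", firstyear),
    ("Shallow copy", secondyear),
    ("Normal copy (alias)", thirdyear),
]:
    print(title)
    for member in members:
        print(member.describe())
    print()
```

Replacing team[2] rebinds a slot in the original list, so the shallow copy keeps Charlie; setage changes the shared Graham object, so the shallow copy sees the new age.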

Source code - here