In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is
a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black
tree where insert/delete takes O(lg n) time. Or is there some other
explanation for this complexity?
desktop <ff*@sss.com> writes:
In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q
is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a
red-black tree where insert/delete takes O(lg n) time. Or is there
some other explanation for this complexity?
IIRC insert takes log n time, since you have to search for the right
place to insert it.
"Remove this value" also takes log n time, since you have to search
for the thing to delete.
In this case, you already know what thing to delete, since you've
got an iterator to it. No searching required, just fiddling with
some internal bookkeeping.

Dave Steffen, Ph.D. Disobey this command!
Software Engineer IV  Douglas Hofstadter
Numerica Corporation
dgAsteffen aAt numerica dAot us (remove A's to email me)
On 6/12/07 4:00 PM, in article f4**********@news.net.unic.dk, "desktop"
<ff*@sss.com> wrote:
In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is
a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black
tree where insert/delete takes O(lg n) time. Or is there some other
explanation for this complexity?
The explanation is simple. The removal time of a node, when measured for an
RB tree, includes the time needed to find the node to be deleted in the tree.
In the case of a call to set::erase(), no time needs to be spent searching
for the node (that is, the item) to be removed, because its location is
passed to the erase method as a parameter. So by skipping the search for the
item, the item can be removed from the set in amortized constant (and
not logarithmic) time.
Greg
Dave Steffen wrote:
desktop <ff*@sss.com> writes:
>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
IIRC insert takes log n time, since you have to search for the right
place to insert it.
"Remove this value" also takes log n time, since you have to search
for the thing to delete.
In this case, you already know what thing to delete, since you've
got an iterator to it. No searching required, just fiddling with
some internal bookkeeping.
But you still need to do the subsequent rebalancing, which can take O(lg
n) time.
On 6/12/07 4:26 PM, in article f4**********@news.net.unic.dk, "desktop"
<ff*@sss.com> wrote:
Dave Steffen wrote:
>desktop <ff*@sss.com> writes:
>>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
IIRC insert takes log n time, since you have to search for the right place to insert it.
"Remove this value" also takes log n time, since you have to search for the thing to delete.
In this case, you already know what thing to delete, since you've got an iterator to it. No searching required, just fiddling with some internal bookkeeping.
But you still need to do the subsequent rebalancing, which can take O(lg
n) time.
But the amortized time for rebalancing an RB tree is O(1), in other
words, a constant amount of time. So the C++ Standard's performance
requirements for deleting an item from a set can be met by implementing the
set with an RB tree.
Greg
Greg Herlihy wrote:
>
On 6/12/07 4:26 PM, in article f4**********@news.net.unic.dk, "desktop"
<ff*@sss.com> wrote:
>Dave Steffen wrote:
>>desktop <ff*@sss.com> writes:
In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity? IIRC insert takes log n time, since you have to search for the right place to insert it.
"Remove this value" also takes log n time, since you have to search for the thing to delete.
In this case, you already know what thing to delete, since you've got an iterator to it. No searching required, just fiddling with some internal bookkeeping.
But you still need to do the subsequent rebalancing, which can take O(lg n) time.
But the amortized time for rebalancing an RB tree is O(1), in other
words, a constant amount of time. So the C++ Standard's performance
requirements for deleting an item from a set can be met by implementing the
set with an RB tree.
Greg
The delete version in Introduction to Algorithms by Thomas Cormen is
the version that takes a pointer to an element (not a key that first has
to be found).
But I can't see how you can avoid the O(lg n) time, since the
subroutine 'tree-successor' has a running time equal to the height of
the tree.
Greg Herlihy wrote:
In the case of a call to set::erase(), no time needs to be spent searching
for the node (that is, the item) to be removed, because its location is
passed to the erase method as a parameter. So by skipping the search for the
item, the item can be removed from the set in amortized constant (and
not logarithmic) time.
And the algorithm somehow manages to rebalance the tree in amortized
constant time?
On Jun 13, 1:00 am, desktop <f...@sss.com> wrote:
In the C++ standard sec 23.1.2 table 69 it says that erase(q)
where q is a pointer to an element can be done in amortized
constant time.
No it doesn't. It says that the complexity is "amortized
constant". And in §23.1/2, it says clearly that "All of the
complexity requirements in this clause are stated solely in
terms of the number of operations on the contained objects."
I would expect that erase(q) requires one call to the destructor
of the object, and that is it. Which definitely makes it O(1);
I don't even know why there is an "amortized" in there.
The standard never makes any requirements with regards to time.
In a very real way, it can't; there are too many variables
involved that are beyond the control of the implementation.

James Kanze (GABI Software, from CAI) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.Cyrl'École, France, +33 (0)1 30 23 00 34
James Kanze wrote:
On Jun 13, 1:00 am, desktop <f...@sss.com> wrote:
>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
No it doesn't. It says that the complexity is "amortized
constant". And in §23.1/2, it says clearly that "All of the
complexity requirements in this clause are stated solely in
terms of the number of operations on the contained objects."
I would expect that erase(q) requires one call to the destructor
of the object, and that is it. Which definitely makes it O(1);
I don't even know why there is an "amortized" in there.
OK, so when erase(p) is said to be amortized constant, they exclude the
time used to rebalance the tree, which would result in logarithmic time.
desktop wrote:
In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is
a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black
tree where insert/delete takes O(lg n) time. Or is there some other
explanation for this complexity?
You don't read carefully enough.
According to the standard the function erase(T value) has log(N)
complexity where N is the number of elements. This function requires
searching.
And the function erase(iterator it) has constant complexity because
it doesn't require searching.

it's mail not fail
V.R. Marinov wrote:
desktop wrote:
>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
You don't read carefully enough.
According to the standard the function erase(T value) has log(N)
complexity where N is the number of elements. This function requires
searching.
And the function erase(iterator it) has constant complexity because
it doesn't require searching.
You don't read carefully enough. The OP argues that the rebalancing of
the tree takes O(log n) time, so, even disregarding the search, the
complexity would be O(log n) just for the rebalancing.
Regards,
Zeppe
Zeppe wrote:
V.R. Marinov wrote:
>desktop wrote:
>>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
You don't read carefully enough. According to the standard the function erase(T value) has log(N) complexity where N is the number of elements. This function requires searching.
And the function erase(iterator it) has constant complexity because it doesn't require searching.
You don't read carefully enough. The OP argues that the rebalancing of
the tree takes O(log n) time, so, even disregarding the search, the
complexity would be O(log n) just for the rebalancing.
I guess I was trying to say that rebalancing doesn't take O(lg(N))
complexity.

it's mail not fail
V.R. Marinov wrote:
desktop wrote:
>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
You don't read carefully enough.
According to the standard the function erase(T value) has log(N)
complexity where N is the number of elements. This function requires
searching.
And the function erase(iterator it) has constant complexity because
it doesn't require searching.
As James Kanze pointed out, the constant complexity counts only operations
on contained objects and not the extra time needed to do rebalancing, which
leads to an O(lg n) bound.
desktop wrote:
James Kanze wrote:
>On Jun 13, 1:00 am, desktop <f...@sss.com> wrote:
>>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
No it doesn't. It says that the complexity is "amortized constant". And in §23.1/2, it says clearly that "All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects." I would expect that erase(q) requires one call to the destructor of the object, and that is it. Which definitely makes it O(1); I don't even know why there is an "amortized" in there.
OK, so when erase(p) is said to be amortized constant, they exclude the
time used to rebalance the tree, which would result in logarithmic time.
Not at all.
1) the sentence in 23.1/2 refers to the fact that the complexity is
independent of the type of object that is stored in the container. The
complexity is calculated in terms of operations on the objects, exactly
as it is normally calculated in computer science books.
2) I don't have any book on data structures at hand, but Wikipedia says
that the rebalancing takes O(log N) or amortized O(1) time. So, it's
reasonable to say that if you do not need to search but just to
rebalance, the required time is amortized O(1). Amortized O(1) is not
the same as O(1): the rebalancing IS O(log N) in the worst case, but
there is a guarantee that the worst case can't happen too often, so on
average it's O(1). More accurately, it means that if you have to erase a
sequence, this will be done in O(n) (not O(n log n)) time. The standard
apparently gives only this "weak" requirement.
3) having said that, I would suggest you read carefully about these data
structures if you are interested in them, because it seems like you are
trying to understand pretty complex and theoretical material having taken
just a quick look through some book. The questions you are asking are
quite difficult, and hardly anybody here is proficient in the theory of
RB trees (which is not strictly related to C++ either).
Regards,
Zeppe
Zeppe wrote:
desktop wrote:
>James Kanze wrote:
>>On Jun 13, 1:00 am, desktop <f...@sss.com> wrote: In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
No it doesn't. It says that the complexity is "amortized constant". And in §23.1/2, it says clearly that "All of the complexity requirements in this clause are stated solely in terms of the number of operations on the contained objects." I would expect that erase(q) requires one call to the destructor of the object, and that is it. Which definitely makes it O(1); I don't even know why there is an "amortized" in there. OK, so when erase(p) is said to be amortized constant, they exclude the time used to rebalance the tree, which would result in logarithmic time.
Not at all.
1) the sentence in 23.1/2 refers to the fact that the complexity is
independent of the type of object that is stored in the container. The
complexity is calculated in terms of operations on the objects, exactly
as it is normally calculated in computer science books.
2) I don't have any book on data structures at hand, but Wikipedia says
that the rebalancing takes O(log N) or amortized O(1) time. So, it's
reasonable to say that if you do not need to search but just to
rebalance, the required time is amortized O(1). Amortized O(1) is not
the same as O(1): the rebalancing IS O(log N) in the worst case, but
there is a guarantee that the worst case can't happen too often, so on
average it's O(1). More accurately, it means that if you have to erase a
sequence, this will be done in O(n) (not O(n log n)) time. The standard
apparently gives only this "weak" requirement.
But delete not only does rebalancing, it also uses tree-successor, which
has a running time equal to the height of the tree. This subroutine is
used in the version of delete that takes a pointer to the element that is
going to be deleted.
3) having said that, I would suggest you read carefully about these data
structures if you are interested in them, because it seems like you are
trying to understand pretty complex and theoretical material having taken
just a quick look through some book. The questions you are asking are
quite difficult, and hardly anybody here is proficient in the theory of
RB trees (which is not strictly related to C++ either).
The literature I have read doesn't cover these special cases or uses of
the red-black tree, and the basics it covers are not very explanatory (I
don't think you can find a book covering the requirements of the
operations in the C++ standard in this kind of detail), but I agree that
it's getting off-topic and I will make sure to post theoretical questions
in a more appropriate group.
On Jun 13, 3:15 pm, desktop <f...@sss.com> wrote:
James Kanze wrote:
On Jun 13, 1:00 am, desktop <f...@sss.com> wrote:
In the C++ standard sec 23.1.2 table 69 it says that erase(q)
where q is a pointer to an element can be done in amortized
constant time.
No it doesn't. It says that the complexity is "amortized
constant". And in §23.1/2, it says clearly that "All of the
complexity requirements in this clause are stated solely in
terms of the number of operations on the contained objects."
I would expect that erase(q) requires one call to the destructor
of the object, and that is it. Which definitely makes it O(1);
I don't even know why there is an "amortized" in there.
OK, so when erase(p) is said to be amortized constant, they
exclude the time used to rebalance the tree, which would
result in logarithmic time.
Essentially, yes. They also exclude the time to deallocate the
node; the time necessary for deallocation could depend on the
number of individual blocks allocated. They also exclude any
time necessary for paging, due to poor locality or whatever,
which might also depend (indirectly) on the size of the set. I
imagine that most people would agree that these last two are
reasonable to exclude; there's no way to really say much about
them anyway. But that means that the complexity guarantees
can't be measured in time. And the results of measuring them in
operations on the contained object results in the time to
rebalance the tree being excluded as well.
Whether this is significant depends on the object type. If
comparison or destruction are complicated operations, the few
pointer operations necessary for rebalancing won't be
significant, even if the tree is 20 deep, and you hit the worst
case. If the set only contains ints, and you use a blocked
fixed-size allocator which can deallocate with only one or
two pointer manipulations, the rebalancing is a major part of
the job.
So I'm not saying that it's good or bad, the way it's specified.
I'm just saying that that's how it's specified.

James Kanze (Gabi Software) email: ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.Cyrl'École, France, +33 (0)1 30 23 00 34
On 6/13/07 5:14 AM, in article 46**********************@news.song.fi, "Juha
Nieminen" <no****@thanks.invalid> wrote:
Greg Herlihy wrote:
>In the case of a call to set::erase(), no time needs to be spent searching for the node (that is, the item) to be removed, because its location is passed to the erase method as a parameter. So by skipping the search for the item, the item can be removed from the set in amortized constant (and not logarithmic) time.
And the algorithm somehow manages to rebalance the tree in amortized
constant time?
Yes.
And if you would like to know exactly how a RB tree is rebalanced, you can
look up the relevant entry in Wikipedia as a starting point.
Greg
V.R. Marinov wrote:
I guess I was trying to say that rebalancing doesn't take O(lg(N))
complexity.
Note that O() and "amortized time" are different things. O() denotes
the *worst-case* scenario of one single operation. Even if rebalancing
the tree after a removal could be done in *amortized* constant time,
it could still be O(lg(N)).
Granted, I'm not acquainted with the exact algorithm used for
rebalancing the tree, but I'm pretty sure it takes lg(N) time in
the worst case.
It's the same thing as with the tree traversal: even though a whole
traversal can be done in linear time (i.e., the whole traversal takes
O(N) steps), one single iterator increment is O(lg(N)) (because there
are cases where it takes that many operations).
Juha Nieminen wrote:
V.R. Marinov wrote:
>I guess I was trying to say that rebalancing doesn't take O(lg(N)) complexity.
Note that O() and "amortized time" are different things. O() denotes
the *worst-case* scenario of one single operation. Even if rebalancing
the tree after a removal could be done in *amortized* constant time,
it could still be O(lg(N)).
Granted, I'm not acquainted with the exact algorithm used for
rebalancing the tree, but I'm pretty sure it takes lg(N) time in
the worst case.
It's the same thing as with the tree traversal: even though a whole
traversal can be done in linear time (i.e., the whole traversal takes
O(N) steps), one single iterator increment is O(lg(N)) (because there
are cases where it takes that many operations).
The tree-successor subroutine in delete (before the call to rebalance)
also takes O(lg n) time, so this must dominate delete even though you can
do rebalancing in amortized constant time.
On 6/14/07 3:44 AM, in article 46**********************@news.song.fi, "Juha
Nieminen" <no****@thanks.invalid> wrote:
V.R. Marinov wrote:
>I guess I was trying to say that rebalancing doesn't take O(lg(N)) complexity.
Note that O() and "amortized time" are different things. O() denotes
the *worst-case* scenario of one single operation. Even if rebalancing
the tree after a removal could be done in *amortized* constant time,
it could still be O(lg(N)).
Not really. Remember that deleting an item must require no more than an
amortized constant number of operations: therefore, if erasing an item from
a set takes O(log(N)) operations (say, to rebalance the underlying RB
tree), then it must be the case that none of the preceding (or following)
items erased could have required more than a constant number of operations
to complete. In other words, in order to achieve a constant amortized
operation count, there must be a sharp limit on how often a "worst case" may
occur relative to the number of "best cases" that occur when erasing an item
from a set.
So, unlike, say, searching for an item in a set (in which every search could
require a "worst case" or O(log(N)) number of operations), erasing an item
from a set may require O(log(N)) operations only rarely (I estimate around 1
in e^N items erased could turn out to be a worst case).
A clearer example would be a std::vector that doubles its capacity whenever
its size is about to exceed its current capacity. Now, even though inserting
an item into this vector could require a (worst case) N number of operations
(to allocate the added capacity), it should be clear that not every
insertion could be a worst case. In fact the number of such worst cases is
limited to 1 per every 2^N insertions. So in this case, even though the
worst case has linear complexity, the average case still has a constant
complexity, because as the worst cases become worse, they also become less
frequent by the same degree.
Granted, I'm not acquainted with the exact algorithm used for
rebalancing the tree, but I'm pretty sure it takes lg(N) time in
the worst case.
Yes, but the worst cases occur less and less frequently and are therefore
always compensated by an ever-increasing number of constant "best cases".
So the average complexity per deletion, when the best cases are included
with the worst cases, is still a constant.
It's the same thing as with the tree traversal: Even though a whole
traversal can be done in linear time (iow. the whole traversal takes
O(N) steps), one single iterator increment is O(lg(N)) (because there
are cases where it takes that many operations).
Your explanation is entirely backwards. First, it seems that you must be
describing the number of steps needed to find a specified item in a
container. Second, linear complexity, O(N), is much worse than logarithmic
complexity, O(log(N)). So O(N) is the expected worst case to find an item
in certain unsorted containers (such as a vector). To find an item in a
set, on the other hand, has only a logarithmic worst-case complexity,
O(log(N)), a significant improvement over a vector.
In any event, searching in a set is not comparable to erasing or adding an
item (at a known location), since, when searching, there is no limit on
the frequency of "worst case" searches versus "best case" searches (in
fact every search could be a worst case), so its complexity has to be
reported with the most pessimistic assumption, just in case.
Greg
Greg Herlihy wrote:
So, unlike, say, searching for an item in a set (in which every search could
require a "worst case" or O(log(N)) number of operations), erasing an item
from a set may require O(log(N)) operations only rarely (I estimate around 1
in e^N items erased could turn out to be a worst case).
I think you don't really understand the O() notation. The O()
notation expresses the asymptotic *upper limit* for the time required
to perform the specified operation. Or in other words, it describes
the *worst case*, i.e. the largest amount of time the operation could
possibly take. It doesn't matter if this happens rarely. O() doesn't
express the average time nor the amortized time for that operation.
Even if erasing the element can be done in constant time one
billion times for each lg(n)-time erase, erasing the element is
still an O(lg(n))-time algorithm. As I said, O() expresses the
absolute worst case scenario.
"Amortized time" is a completely different thing. It says that if
you perform the operation a very large number of times over a typical
data set, how much time it will take *on average*.
A clearer example would be a std::vector that doubles its capacity whenever
its size is about to exceed its current capacity. Now, even though inserting
an item into this vector could require a (worst case) N number of operations
(to allocate the added capacity), it should be clear that not every
insertion could be a worst case.
Insertion into a vector *is* an O(n) operation. It doesn't matter if
the operation is constant-time when amortized over a large number of
insertions. O() expresses the worst possible case of one operation.
In fact the number of such worst cases is
limited to 1 per every 2^N insertions. So in this case, even though the
worst case has linear complexity, the average case still has a constant
complexity, because as the worst cases become worse, they also become less
frequent by the same degree.
Yes, back-insertion in a vector is an *amortized* constant-time
operation, but it's an O(n) operation.
> It's the same thing as with the tree traversal: even though a whole traversal can be done in linear time (i.e., the whole traversal takes O(N) steps), one single iterator increment is O(lg(N)) (because there are cases where it takes that many operations).
Your explanation is entirely backwards. First, it seems that you must be
describing the number of steps needed to find a specified item in a
container.
No, I'm describing a tree traversal from begin() to end(). The entire
traversal takes O(n) time even though one single iterator increment
is an O(lg(n)) operation.
(Common sense would dictate that if an iterator increment is an
O(lg(n)) operation and it's done n times, the entire traversal must
be an O(n*lg(n)) operation. However, quite curiously, it's not. It's
an O(n) operation. It's actually rather easy and intuitive to prove.)
Second, linear complexity, O(N), is much worse than logarithmic
complexity, O(log(N)).
So what? I was talking about an entire tree traversal, from
begin() to end(). It's a linear operation.
So O(N) is the expected worst case to find an item in
certain unsorted containers (such as a vector). To find an item in a set,
on the other hand, has only a logarithmic worst-case complexity,
O(log(N)), a significant improvement over a vector.
I was not talking about finding an item.
"Greg Herlihy" <gr****@pacbell.net> wrote in message
news:C2******************@pacbell.net...
On 6/14/07 3:44 AM, in article 46**********************@news.song.fi, "Juha
Nieminen" <no****@thanks.invalid> wrote:
>V.R. Marinov wrote:
>>I guess I was trying to say that rebalancing doesn't take O(lg(N)) complexity.
Note that O() and "amortized time" are different things. O() denotes the *worst-case* scenario of one single operation. Even if rebalancing the tree after a removal could be done in *amortized* constant time, it could still be O(lg(N)).
Not really. Remember that deleting an item must require no more than an
amortized constant number of operations: therefore, if erasing an item from
a set takes O(log(N)) operations (say, to rebalance the underlying RB
tree), then it must be the case that none of the preceding (or following)
items erased could have required more than a constant number of operations
to complete. In other words, in order to achieve a constant amortized
operation count, there must be a sharp limit on how often a "worst case" may
occur relative to the number of "best cases" that occur when erasing an item
from a set.
So, unlike, say, searching for an item in a set (in which every search could
require a "worst case" or O(log(N)) number of operations), erasing an item
from a set may require O(log(N)) operations only rarely (I estimate around 1
in e^N items erased could turn out to be a worst case).
.....
>
Yes, but the worst cases occur less and less frequently and are therefore
always compensated by an ever-increasing number of constant "best cases".
So the average complexity per deletion, when the best cases are included
with the worst cases, is still a constant.
This is not true. It depends on the order of deletion.
If the root of the tree is deleted each time, each delete operation will
take O(log N) time.
So even amortized, this is O(log N) per deletion.
If you always delete a leaf node, the deletion will take O(1) time
(ignoring rebalancing).
Hans.
>
The delete version in Introduction to Algorithms by Thomas Cormen is
the version that takes a pointer to an element (not a key that first has
to be found).
But I can't see how you can avoid the O(lg n) time, since the
subroutine 'tree-successor' has a running time equal to the height of
the tree.
It has a worst-case running time equal to the height of the tree. But the
average time is O(1).
john
>
But delete not only does rebalancing, it also uses tree-successor, which
has a running time equal to the height of the tree.
Tree-successor has a worst-case running time equal to the height of the
tree, but its average time is O(1).
john
desktop wrote:
V.R. Marinov wrote:
>desktop wrote:
>>In the C++ standard sec 23.1.2 table 69 it says that erase(q) where q is a pointer to an element can be done in amortized constant time.
I guess that is not worst case since std::set is practically a red-black tree where insert/delete takes O(lg n) time. Or is there some other explanation for this complexity?
You don't read carefully enough. According to the standard the function erase(T value) has log(N) complexity where N is the number of elements. This function requires searching.
And the function erase(iterator it) has constant complexity because it doesn't require searching.
As James Kanze pointed out, the constant complexity counts only operations
on contained objects and not the extra time needed to do rebalancing,
which leads to an O(lg n) bound.
That may be true of the standard (the standard is pretty useless in that
case IMHO), but it is still the case that deleting from a red-black tree,
given the node to delete, is in the average case a constant-time
operation.
john
On 6/15/07 2:35 AM, in article 46**********************@news.song.fi, "Juha
Nieminen" <no****@thanks.invalid> wrote:
Greg Herlihy wrote:
>So, unlike, say, searching for an item in a set (in which every search could require a "worst case" or O(log(N)) number of operations), erasing an item from a set may require O(log(N)) operations only rarely (I estimate around 1 in e^N items erased could turn out to be a worst case).
I think you don't really understand the O() notation. The O()
notation expresses the asymptotic *upper limit* for the time required
to perform the specified operation. Or in other words, it describes
the *worst case*, i.e. the largest amount of time the operation could
possibly take. It doesn't matter if this happens rarely. O() doesn't
express the average time nor the amortized time for that operation.
You misunderstand the word "limit" as it applies to big-O notation. The
word "limit" here has the same meaning as it does in calculus. A limit is
not the maximum number of operations of a "worst case" scenario; rather, a
limit is the value that the entire series of individual worst cases is
converging upon (or approaching). So a big-O complexity measure really
does convey something akin to the "average" (or expected) number of
operations (in the worst case) and really says nothing about how many
operations the worst of the worst cases could take, a number that may
greatly exceed the complexity described by the big-O notation.
Even if erasing the element can be done in constant time one
billion times for each lg(n)-time erase, erasing the element is
still an O(lg(n))-time algorithm. As I said, O() expresses the
absolute worst case scenario.
No. In order for erasing an item from a set to have O(log(N)) complexity,
the limit of the series of worst-case deletions that could be performed
upon a set must approach O(log(N)). Yet no matter how carefully you
might select the items to be placed in the set, and no matter how carefully
you might order each deletion in order to maximize the number of operations
required, the number of operations of each deletion, when divided by the
number of items deleted, will not approach O(log(N)) but will instead
converge toward a constant value. In other words, deleting an item from a
set has constant, not logarithmic, complexity.
"Amortized time" is a completely different thing. It says how much
time the operation will take *on average* if you perform it a very large
number of times over a typical data set.
No. Amortized complexity is not a statistical measure, nor does it describe
the "average case" complexity (and the data set may not be "typical"; it must
be a worst-case data set). So amortized complexity describes the limit of
the series of "worst cases" (so it's similar in concept to the average
"worst" case). Note also that an amortized complexity need not be a
constant. The amortized complexity of a binary search is O(log(N)), since the
number of operations over the entire series of "worst case" searches does
not converge upon a constant but approaches an ever-increasing value, its
"limit".
>A clearer example would be a std::vector that doubles its capacity whenever its size is about to exceed its current capacity. Now, even though inserting an item into this vector could require a (worst case) N number of operations (to allocate the added capacity), it should be clear that not every insertion could be a worst case.
Insertion into a vector *is* an O(n) operation. It doesn't matter if
the operation is constant-time when amortized over a large number of
insertions. O() expresses the worst possible case of one operation.
Insertion at the start of a vector (or a fixed distance from the start) does
have linear complexity, while inserting at the end (or a fixed distance
before the end) has constant-time complexity.
>In fact the number of such worst cases is limited to 1 per every 2^N insertions. So in this case, even though the worst case has linear complexity, the average case still has a constant complexity, because as the worst cases become worse, they also become less frequent by the same degree.
Yes, back-insertion in a vector is an *amortized* constant-time
operation, but it's an O(n) operation.
No, inserting at the end of a vector has constant-time complexity. It is not
an O(N) operation by any means.
Greg
John Harrison wrote:
>> But delete not only uses rebalancing, it also uses tree-successor, which has a running time equal to the height of the tree.
Tree-successor has a worst-case running time equal to the height of the
tree, but its average time is O(1).
john
hehe, everything seems to be running in amortized constant time now :)
Do you know of any documents that prove that tree-successor runs in
amortized constant time?
On 2007-06-16 00:40, Greg Herlihy wrote:
On 6/15/07 2:35 AM, in article 46**********************@news.song.fi, "Juha
Nieminen" <no****@thanks.invalid> wrote:
>Greg Herlihy wrote:
>>So, unlike, say, searching for an item in a set (in which every search could require a "worst case" or O(log(N)) number of operations), erasing an item from a set may require O(log(N)) operations only rarely (I estimate around 1 in e^N items erased could turn out to be a worst case).
I think you don't really understand the O() notation. The O() notation expresses the asymptotic *upper limit* for the time required to perform the specified operation. Or in other words, it describes the *worst case*, i.e. the largest amount of time the operation could possibly take. It doesn't matter if this happens rarely. O() doesn't express the average time nor the amortized time for that operation.
You misunderstand the word "limit" as it applies to big-O notation. The word
"limit" here has the same meaning as it does in calculus. A limit is not the
maximum number of operations of a "worst case" scenario; rather, a limit is
the value that the entire series of individual worst cases is converging
upon (or approaching). So a big-O complexity measure really does convey
something akin to the "average" (or expected) number of operations (in the
worst case) and really says nothing about how many operations the worst
of the worst cases could take, a number that may greatly exceed the
complexity described by the big-O notation.
> Even if erasing the element can be done in constant time one billion times for each lg(n)-time erase, erasing the element is still an O(lg(n))-time algorithm. As I said, O() expresses the absolute worst case scenario.
No. In order for erasing an item from a set to have O(log(N)) complexity,
the limit of the series of worst-case deletions that could be performed
upon a set must approach O(log(N)). Yet no matter how carefully you
might select the items to be placed in the set, and no matter how carefully
you might order each deletion in order to maximize the number of operations
required, the number of operations of each deletion, when divided by the
number of items deleted, will not approach O(log(N)) but will instead
converge toward a constant value. In other words, deleting an item from a
set has constant, not logarithmic, complexity.
> "Amortized time" is a completely different thing. It says how much time the operation will take *on average* if you perform it a very large number of times over a typical data set.
No. Amortized complexity is not a statistical measure, nor does it describe
the "average case" complexity (and the data set may not be "typical"; it must
be a worst-case data set). So amortized complexity describes the limit of
the series of "worst cases" (so it's similar in concept to the average
"worst" case). Note also that an amortized complexity need not be a
constant. The amortized complexity of a binary search is O(log(N)), since the
number of operations over the entire series of "worst case" searches does
not converge upon a constant but approaches an ever-increasing value, its
"limit".
>>A clearer example would be a std::vector that doubles its capacity whenever its size is about to exceed its current capacity. Now, even though inserting an item into this vector could require a (worst case) N number of operations (to allocate the added capacity), it should be clear that not every insertion could be a worst case.
Insertion into a vector *is* an O(n) operation. It doesn't matter if the operation is constant-time when amortized over a large number of insertions. O() expresses the worst possible case of one operation.
Insertion at the start of a vector (or a fixed distance from the start) does
have linear complexity, while inserting at the end (or a fixed distance
before the end) has constant-time complexity.
>>In fact the number of such worst cases is limited to 1 per every 2^N insertions. So in this case, even though the worst case has linear complexity, the average case still has a constant complexity, because as the worst cases become worse, they also become less frequent by the same degree.
Yes, back-insertion in a vector is an *amortized* constant-time operation, but it's an O(n) operation.
No, inserting at the end of a vector has constant-time complexity. It is not
an O(N) operation by any means.
Are you sure? I was under the impression that insertion into a vector
(even at the end) is amortized constant time due to the need to
reallocate now and then. Insertion at the end of a list (or anywhere,
provided a pointer/iterator), on the other hand, is constant, since there
will never be any reallocations.

Erik Wikström
desktop wrote:
John Harrison wrote:
>>> But delete not only uses rebalancing, it also uses tree-successor, which has a running time equal to the height of the tree.
Tree-successor has a worst-case running time equal to the height of the tree, but its average time is O(1).
john
hehe, everything seems to be running in amortized constant time now :)
Do you know of any documents that prove that tree-successor runs in
amortized constant time?
Well, exercise 13.2-4 in Cormen states (in effect): prove that you can
use tree-successor to iterate through an n-node binary search tree in
O(n) time. So each individual step must be O(1), amortized.
But I'm no mathematical expert.
John Harrison wrote:
desktop wrote:
>John Harrison wrote:
>>>> But delete not only uses rebalancing, it also uses tree-successor, which has a running time equal to the height of the tree.
Tree-successor has a worst-case running time equal to the height of the tree, but its average time is O(1).
john
hehe, everything seems to be running in amortized constant time now :) Do you know of any documents that prove that tree-successor runs in amortized constant time?
Well, exercise 13.2-4 in Cormen states (in effect): prove that you can
use tree-successor to iterate through an n-node binary search tree in
O(n) time. So each individual step must be O(1), amortized.
But I'm no mathematical expert.
I think the proof would be along the lines of this.
A binary tree of N nodes has N-1 links. If tree-successor is called to
iterate through the tree from beginning to end, then each link will be
traversed twice (once in each direction). Therefore 2N-2 traversals are
made, an average of (2N-2)/N traversals for each step.
Therefore O(1) for each step.
john
"John Harrison" <jo*************@hotmail.com> wrote in message
news:01*************@newsfe1win.ntli.net...
desktop wrote:
>John Harrison wrote:
>>>> But delete not only uses rebalancing, it also uses tree-successor, which has a running time equal to the height of the tree.
Tree-successor has a worst-case running time equal to the height of the tree, but its average time is O(1).
john
hehe, everything seems to be running in amortized constant time now :) Do you know of any documents that prove that tree-successor runs in amortized constant time?
Well, exercise 13.2-4 in Cormen states (in effect): prove that you can use
tree-successor to iterate through an n-node binary search tree in O(n) time. So
each individual step must be O(1), amortized.
This is only true as long as the tree doesn't change during the iteration.
If you iterate from the root, it takes O(log N) time to find the successor.
If that successor is moved to the root before you iterate to its successor,
that step will also take O(log N) time.
So if the tree is modified while iterating, the amortized time can still be
O(log N) per iteration.
Hans.
In article <46*********************@news.xs4all.nl>, ha******@xelion.nl
says...
[ ... ]
If the root of the tree is deleted each time, each delete operation will take
O(log N) time.
So even amortized, this is O(log N) per deletion.
One minor addition to note: since you are deleting nodes, the value of N
decreases with each deletion, so when you delete the last node (for
example) N is 1.
In terms of big-O notation, this has no effect, since O(log(N/2)) is the
same as O(log(N)): both are logarithmic. In the real world it's often
worth keeping in mind, though...

Later,
Jerry.
The universe is a figment of its own imagination.
John Harrison wrote:
its average time is O(1).
Nitpicking here, but I'm not sure you should be using the O()
notation for the average time. The O notation expresses the worst-case
scenario and is not related to the average case in any way.
More correct would probably be: "on average it's a constant-time
operation".
Greg Herlihy wrote:
You misunderstand the word "limit" as it applies to big-O notation. The word
"limit" here has the same meaning as it does in calculus. A limit is not the
maximum number of operations of a "worst case" scenario; rather, a limit is
the value that the entire series of individual worst cases is converging
upon (or approaching).
I don't think so.
Ok, I used the wrong term: it's not the "upper limit" but the
"upper bound" (there might be a slight difference in meaning).
The big-O notation gives the asymptotically "smallest" function
which the operation never exceeds. In other words, at no time will
the operation/algorithm perform asymptotically worse than what this
function says.
For example, quicksort is an O(n^2) algorithm, even though on average
it performs faster than that. The O notation simply expresses its
worst-case running time.
Back-insertion in a std::vector is an O(n) operation, even if it's
amortized constant-time. This is because from time to time the insertion
will require n operations. The O notation thus expresses the upper
bound for all the possible back-insertion operations (in other words,
it basically says that back-insertion will never be slower than linear
time, but it says nothing about whether it can run faster than that or how
fast it runs on average).
Saying that back-insertion into std::vector is an O(1) operation
would be wrong. It would be giving the wrong information. Saying that
is equivalent to saying "back-insertion never runs slower than
constant time", which is not true.
Likewise a map iterator increment is an O(lg(n)) operation. It tells
us that one increment can take lg(n) steps, but never more. However,
this information alone doesn't tell us how much time an increment takes
on average or in the best-case scenario.
> Even if erasing the element can be done in constant time one billion times for each lg(n)-time erase, erasing the element is still an O(lg(n))-time algorithm. As I said, O() expresses the absolute worst case scenario.
No. In order for erasing an item from a set to have O(log(N)) complexity,
the limit of the series of worst-case deletions that could be performed
upon a set must approach O(log(N)). Yet no matter how carefully you
might select the items to be placed in the set, and no matter how carefully
you might order each deletion in order to maximize the number of operations
required, the number of operations of each deletion, when divided by the
number of items deleted, will not approach O(log(N)) but will instead
converge toward a constant value. In other words, deleting an item from a
set has constant, not logarithmic, complexity.
You are still not understanding that O() tells us the upper bound
for the operation, not its average.
No, inserting at the end of a vector has constant-time complexity. It is not
an O(N) operation by any means.
O(n) tells us that "a back-insertion can take n steps, but no more".
This is true for std::vector, so the statement is true.