shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt


shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Chris Bowers
Authors,
 
I’m in the process of doing the Shepherd write-up for draft-ietf-rtgwg-uloop-delay-05.txt. 
 
In reading the latest version of the document, I wrote down some feedback.
A diff can be found at:
 
 
Most of the feedback is related to clarifying language and typos.  However, there
are a few comments that I think are more substantive, so I am
reproducing them below since they should probably be discussed on the list.
 
===========
[CB]  I find the examples presented in section 1 and section 2.1 to
be confusing.  The conclusion drawn in the last paragraph of section
2.1 does not seem to follow from these examples.
 
Section 1 (figure 1) shows an example of micro-loops occurring when shortest
path forwarding is used and the metrics are such that LFA and rLFA
produce no backup paths from the PLR.

Section 2.1 (figure 2) also shows an example of micro-loops occurring when
shortest path forwarding is used and the metrics are such that LFA and rLFA
produce no backup paths from the PLR.  In this example, however,
a one-hop RSVP tunnel is provisioned to provide link protection for one of
the links.  Even with this one-hop RSVP tunnel, the example
demonstrates that micro-loops can occur.
 
The last paragraph asserts that:
"The issue described here is completely independent of the fast-
reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...)."
 
There are two problems with this assertion.
 
Problem 1) I don't think that the assertion is correct for RSVP TE-FRR in general.
 
For classical RSVP TE-FRR, there would be an RSVP-signaled LSP from S to D. 
Before the failure of the link C-B, this LSP would follow the path
S-E-C-B-A-D.  Immediately after the failure of link C-B, the LSP would
follow the path S-E-C-E-A-B-A-D using the bypass LSP at C.  Once S is
made aware of the failure, S will resignal the LSP to take the path S-E-A-D.
At no time would looping occur. 
 
I assume that it wasn't the initial intention to claim that RSVP TE-FRR suffers from
micro-looping, but the text currently reads that way.  The assertion of the last
paragraph should be qualified to talk about how microloops will still affect traffic
forwarded hop-by-hop over links protected with one-hop RSVP-signaled LSPs.
 
Problem 2) The assertion may be correct for LFA/rLFA and MRT, but it has not
been demonstrated with the examples provided.  I think it may instead be
the case that the assertion may not be true for local LFA in some circumstances.
In particular, if traffic to a given destination can be protected for a given
failure by the PLR using a local LFA that is the same as the post convergence
path, then that traffic will not be subject to microloops.
 
Perhaps the overall intention of the example in figure 2 using
links protected with one-hop RSVP-signaled LSPs was to say that no
matter how much flexibility you give yourself in building a backup path
from the PLR, if the PLR stops using the backup path before other routers
stop sending traffic to the PLR, then you can still have forwarding loops.
However, I think the complexity and detail of the example using one-hop
RSVP-signaled LSPs ends up confusing the matter.
 
The text should either work more systematically through examples to
substantiate the assertion, or the assertion should be scaled back. 
Regardless, the assertion needs to be clarified with respect to RSVP-TE FRR.
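
As a rough illustration of the ordering argument above, the sketch below traces hop-by-hop forwarding toward one destination under three FIB states. It is not taken from the draft: the four-node topology, the router names N and M, and the helper trace() are hypothetical, and the protection tunnel is assumed to be explicitly routed, so it is modelled as a single logical hop from the PLR to D.

```python
# Hypothetical topology (NOT figure 2 of the draft):
#
#   N --- PLR --x-- D      the PLR-D link fails
#   |               |
#   +------ M ------+
#
# Each FIB below maps a router to its next hop toward destination D.

def trace(fibs, source, dest):
    """Follow next hops from source toward dest.

    Returns (path, looped); looped is True if a router is visited twice.
    """
    path = [source]
    node = source
    while node != dest:
        node = fibs[node]
        if node in path:
            return path + [node], True
        path.append(node)
    return path, False

# PLR is on its backup path (assumed explicitly routed tunnel, shown as
# one logical hop to D); N has not converged and still sends to the PLR.
frr_active = {"PLR": "D", "N": "PLR", "M": "D"}

# PLR has left the backup path and now points at N, but N has not
# converged yet and still points at the PLR.
plr_converged_first = {"PLR": "N", "N": "PLR", "M": "D"}

# PLR keeps using the backup path until N has converged onto M.
plr_delayed = {"PLR": "D", "N": "M", "M": "D"}

for name, fibs in [("frr_active", frr_active),
                   ("plr_converged_first", plr_converged_first),
                   ("plr_delayed", plr_delayed)]:
    path, looped = trace(fibs, "N", "D")
    print(name, "-".join(path), "LOOP" if looped else "ok")
```

Only the second state loops, which is the case where the PLR stops using the backup path before its neighbor stops sending traffic to it.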
 
====== 
Section 4.4
 
[CB]  It would be good to write out exactly what the modified version of step 5
looks like so there is no confusion. Something like:
 
5.  Upon SPF_DELAY timer expiration, the SPF is computed.  If the condition
of a single local link-down event has been met, then an update of the
RIB and the FIB is scheduled in ULOOP_DELAY_DOWN_TIMER msecs.  Otherwise,
the RIB and FIB update is scheduled immediately.
 
=========
 
   Such a delay
   SHOULD only be introduced if all the LSDB modifications processed are
   only reporting a single local link down event (Section 4.3).  If a
   subsequent LSP/LSA is received/updated and a new SPF computation is
   triggered before the expiration of ULOOP_DELAY_DOWN_TIMER, then the
   same evaluation SHOULD be performed.
 
=========  
[CB] What should one do if the evaluation of a subsequent LSP/LSA fails
at this point?  Do you go ahead and update the FIB with the forwarding
entries that you were waiting to do?  Or do you do a new SPF with the
new information?  Or is it up to the implementation?
=========
 
I also ran the idnits check, which shows the following issues.
Can you get rid of the unused references and move RFC 5715 from Normative to Informative so that idnits will run clean?
 
 
Thanks,
Chris
 
 

_______________________________________________
rtgwg mailing list
[hidden email]
https://www.ietf.org/mailman/listinfo/rtgwg

RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

stephane.litkowski-2

Hi Chris,

Thanks for the review. I’m updating the document to reflect your proposals.

A couple of comments:

- s/“otherwise the standard IP convergence MUST be used.”/ “otherwise the standard IP convergence MUST used”. It does not sound good to me, but that may be because of an English grammar issue on my side. Could you confirm the change?

- Regarding your main comment on section 1 and 2.1, I do not agree with your statement on RSVP-FRR. First, there are multiple deployment styles of RSVP FRR:

  o LDP tunneling
  o RSVP with no strict ERO
  o RSVP with CSPF at head end (strict ERO)

Your statement is true only for the third case, where an RSVP tunnel between S and D exists with its path computed by S => no uloop in that case for sure. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will specify in the text that we are in an LDP scenario, for example. Here is a text proposal:

“In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C).”

“The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path.”

For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I’m not sure that we need to give such precision: if the LFA is on the postconvergence path, this means that the postconvergence is loop-free, so there will be no local microloop in any case.

- Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

“Upon an adjacency/link down event, this document introduces a change
   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the
   network wide convergence. The new step 5 is described below:”

           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER, then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event.”

Brgds,

Stephane

 

 

From: Chris Bowers [mailto:[hidden email]]
Sent: Tuesday, August 08, 2017 03:01
To: LITKOWSKI Stephane OBS/OINIS; [hidden email]
Cc: [hidden email]
Subject: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Chris Bowers

Stephane,

 

See responses inline with [CB].

 

Chris

 

From: [hidden email] [mailto:[hidden email]]
Sent: Tuesday, August 8, 2017 8:25 AM
To: Chris Bowers <[hidden email]>
Cc: RTGWG <[hidden email]>; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi Chris,

Thanks for the review. I’m updating the document to reflect your proposals.

A couple of comments:

- s/“otherwise the standard IP convergence MUST be used.”/ “otherwise the standard IP convergence MUST used”. It does not sound good to me, but that may be because of an English grammar issue on my side. Could you confirm the change?

[CB]  You are correct.  That proposed change is a mistake on my part.

- Regarding your main comment on section 1 and 2.1, I do not agree with your statement on RSVP-FRR. First, there are multiple deployment styles of RSVP FRR:

  o LDP tunneling
  o RSVP with no strict ERO
  o RSVP with CSPF at head end (strict ERO)

Your statement is true only for the third case, where an RSVP tunnel between S and D exists with its path computed by S => no uloop in that case for sure. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will specify in the text that we are in an LDP scenario, for example. Here is a text proposal:

“In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C).”

“The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path.”

[CB] “when the primary path is an hop by hop defined path” is somewhat ambiguous.
How about “when the primary path uses hop-by-hop routing”?

For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I’m not sure that we need to give such precision: if the LFA is on the postconvergence path, this means that the postconvergence is loop-free, so there will be no local microloop in any case.

[CB]  OK.

- Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

“Upon an adjacency/link down event, this document introduces a change
   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the
   network wide convergence. The new step 5 is described below:”

           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER, then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event.”

 

[CB]  This text seems clearer.
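
One possible reading of the revised step 5 can be sketched as a small decision function. This is only an illustration under stated assumptions, not text or code from the draft: the names Step5State, on_spf_delay_expiry and on_uloop_timer_expiry are hypothetical, the SPF computation itself is left out, and the 2000 msec value in the usage lines is an arbitrary example.

```python
from dataclasses import dataclass


@dataclass
class Step5State:
    # Configured ULOOP_DELAY_DOWN_TIMER value in msecs (assumed to be local config).
    uloop_delay_down_timer_ms: int
    # True while a delayed RIB/FIB update is pending.
    timer_running: bool = False


def on_spf_delay_expiry(state: Step5State, single_local_link_down: bool) -> int:
    """Called after the SPF has been computed on SPF_DELAY expiration.

    Returns the delay in msecs before the RIB/FIB update (0 = immediately).
    """
    if state.timer_running:
        # A new convergence occurred while ULOOP_DELAY_DOWN_TIMER was running:
        # stop the timer and update the RIB/FIB as part of this new event.
        state.timer_running = False
        return 0
    if single_local_link_down:
        # Single local link-down event and no timer was stopped: delay the update.
        state.timer_running = True
        return state.uloop_delay_down_timer_ms
    # Any other event: standard convergence, update immediately.
    return 0


def on_uloop_timer_expiry(state: Step5State) -> None:
    """The delayed RIB/FIB update is applied when the timer fires."""
    state.timer_running = False


state = Step5State(uloop_delay_down_timer_ms=2000)
print(on_spf_delay_expiry(state, single_local_link_down=True))
print(on_spf_delay_expiry(state, single_local_link_down=True))
```

In this reading, the second call models a new convergence arriving while the timer is still running, so the pending delay is dropped and the update happens immediately.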

 

Brgds,

 

Stephane

 

 


RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

stephane.litkowski-2

Thanks Chris, I will post a new revision with those changes.

 

 

From: Chris Bowers [mailto:[hidden email]]
Sent: Tuesday, August 08, 2017 16:05
To: LITKOWSKI Stephane OBS/OINIS
Cc: RTGWG; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Stephane,

 

See responses inline with [CB].

 

Chris

 

From: [hidden email] [[hidden email]]
Sent: Tuesday, August 8, 2017 8:25 AM
To: Chris Bowers <[hidden email]>
Cc: RTGWG <[hidden email]>; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi Chris,

 

Thanks for the review. I’m updating the document to reflect your proposals.

Couple of comments:

-          s/“otherwise the standard IP convergence MUST be used.”/ “otherwise the standard IP convergence MUST used”. It does not sound good to me but may be because of an English grammar issue on my side. Could you confirm the change ?

 

[CB]  You are correct.  That proposed change is a mistake on my part.

 

-          Regarding your main comment on section 1 and 2.1, I do not agree about your statement on RSVP-FRR. First there are multiple deployment styles of RSVP FRR:

o   LDP tunneling

o   RSVP with no strict ERO

o   RSVP with CSPF at head end (strict ERO)

Your statement is true only for the third case where an RSVP tunnel between S and D exists with its path computed by S => no uloop in that case for sure. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will precise in the text that we are in an LDP scenario for example. Here is a text proposal:

“In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C).”

“The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path.”

 

 

[CB] “when the primary path is an hop by hop defined path”  is somewhat ambiguous.

How about “when the primary path uses hop-by-hop routing” ?

 

 

For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I’m not sure that we need to give such precision as if the LFA is on the postconvergence path, this means that the postconvergence is loopfree, so there will be no local microloop in any case.

 

[CB]  OK. 

 

-          Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

                       

“Upon an adjacency/link down event, this document introduces a change

   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the

   network wide convergence. The new step 5 is described below:”

           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER , then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

 

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event.”

 

[CB]  This text seems clearer.

 

Brgds,

 

Stephane

 

 

From: Chris Bowers [[hidden email]]
Sent: Tuesday, August 08, 2017 03:01
To: LITKOWSKI Stephane OBS/OINIS; [hidden email]
Cc: [hidden email]
Subject: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Authors,

 

I’m in the process of doing the Shepherd write-up for draft-ietf-rtgwg-uloop-delay-05.txt. 

 

In reading the latest version of the document, I wrote down some feedback.

A diff can be found at:

 

 

Most of the feedback is related to clarifying language and typos.  However there

are few comments that I think are more substantive so I am

reproducing them below since they should probably discussed on the list.

 

===========

[CB]  I find the examples presented in section 1 and section 2.1 to

be confusing.  The conclusion drawn in the last paragraph of section

2.1 does not seem to follow from these examples.

 

Section 1 (figure 1) shows an example of micro-loops occuring when shortest

path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR. 

 

Section 2.1 (figure 2) also shows an example of micro-loops occuring when

shortest path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR.  However, in this example,

a one-hop RSVP tunnel is provisioned to provide link protection for one of

the links.  However, even with this one-hop RSVP tunnel the example

demonstrates that micro-loops can occur.

 

The last paragraph asserts that:

"The issue described here is completely independent of the fast-

reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...)."

 

There are two problems with this assertion.

 

Problem 1) I don't think that the assertion is correct for RSVP TE-FRR in general.

 

For classical RSVP TE-FRR, there would be an RSVP-signaled LSP from S to D. 

Before the failure of the link C-B, this LSP would follow the path

S-E-C-B-A-D.  Immediately after the failure of link C-B, the LSP would

follow the path S-E-C-E-A-B-A-D using the bypass LSP at C.  Once S is

made aware of the failure.  S will resignal the LSP to take the path S-E-A-D. 

At no time would looping occur. 

 

I assume that it wasn't the initial intention to claim that RSVP TE-FRR suffers from

micro-looping, but the text currently reads that way.  The assertion of the last

paragraph should be qualified to talk about how microloops will still affect traffic

forwarded hop-by-hop over links protected with one-hop RSVP-signaled LSPs.

 

Problem 2) The assertion may be correct for LFA/rLFA and MRT, but it has not

been demonstrated with the examples provided.  I think it may instead be

the case that the assertion nay not be true for local LFA in some circumstances.

In particular, if traffic to a given destination can be protected for a given

failure by the PLR using a local LFA that is the same as the post convergence

path, then that traffic will not be subject to microloops.

 

Perhaps the overall intention of the example in figure 2 using

links protected with one-hop RSVP-signaled LSPs was to say that no

matter how much flexibility you give yourself in building a backup path

from the PLR, if the PLR stops using the backup path before other routers

stop sending traffic to the PLR, then you can still have forwarding loops.

However, I think the complexity and detail of the example using one-hop

RSVP-signaled LSPs ends up confusing the matter.

 

The text should either work more systematically through examples to

substantiate the assertion, or the assertion should be scaled back. 

Regardless, the assertion needs to be clarified with respect to RSVP-TE FRR.

 

====== 

Section 4.4

 

[CB]  It would be good to write out exactly what the modified version of step 5

looks like so there is no confusion. Something like:

 

5.  Upon SPF_DELAY timer expiration, the SPF is computed.  If the condition

of a single local link-down event have been met, then an update of the

RIB and the FIB is scheduled in ULOOP_DELAY_DOWN_TIMER msecs.  Otherwise,

the RIB and FIB update is scheduled immediately.

 

=========

 

   Such a delay

   SHOULD only be introduced if all the LSDB modifications processed are

   only reporting a single local link down event (Section 4.3).  If a

   subsequent LSP/LSA is received/updated and a new SPF computation is

   triggered before the expiration of ULOOP_DELAY_DOWN_TIMER, then the

   same evaluation SHOULD be performed.

 

=========  

[CB] What should one do if the evaluation of a subsequent LSP/LSA fails

at this point?  Do you go ahead and update the FIB with the forwarding

entries that you were waiting to do?  Or do you do a new SPF with the

new information?  Or is it up to the implementation?

=========

 

I also ran the idnits check which show  the following issues.

Can you get rid of the unused references and move RFC 5715 from Normative to informational so that idnits will run clean? 

 

 

Thanks,

Chris

 

 

_________________________________________________________________________________________________________________________
 
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
 
This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.
_________________________________________________________________________________________________________________________


_______________________________________________
rtgwg mailing list
[hidden email]
https://www.ietf.org/mailman/listinfo/rtgwg

RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Sikhivahan Gundu

Hi,

 

Requesting a couple of clarifications.

 

>> If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped

 

Do we stop the timer if the “new convergence” is a result only of links coming
up, i.e., no links have failed?  My interpretation of the old text, as well as the
revision, is that we don’t, but in light of the discussion that this passage
triggered, it seems better to have the interpretation validated, as below:

Imagining the IGP router to be in one of two states:

-- NORMAL-UPDATE state (FIB updated “normally”), also the initial state,

-- and DELAYED-UPDATE state (FIB updated after ULOOP_DELAY_TIMER units of time),

the draft seems to suggest the following state transitions. I’d greatly appreciate
validation.

+----------------+---------------------------------------------------------------------+----------------+
| current state  | event                                                               | next state     |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one local link failure                                              | DELAYED-UPDATE |
| DELAYED-UPDATE | one local link failure                                              | NORMAL-UPDATE  |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one remote link failure, OR two or more (any kind of) link failures | NORMAL-UPDATE  |
| DELAYED-UPDATE | one remote link failure, OR two or more (any kind of) link failures | NORMAL-UPDATE  |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | no link failures (only link-ups)                                    | NORMAL-UPDATE  |
| DELAYED-UPDATE | no link failures (only link-ups)                                    | DELAYED-UPDATE |
+----------------+---------------------------------------------------------------------+----------------+

 

Second: remote loops are illustrated as a non-applicable scenario for this

solution. How about local link failures that do not lead to (local) loops?

Applying the delay in such a case may result in packet loss if there is no

FRR backup.  OTOH, detecting that a local loop will form  involves more

computation.   

 

Thanks,

Sikhi

 

 

From: rtgwg [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: 08 August 2017 20:50
To: Chris Bowers <[hidden email]>
Cc: [hidden email]; RTGWG <[hidden email]>
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Thanks Chris, I will post a new revision with those changes.

 

 

From: Chris Bowers [[hidden email]]
Sent: Tuesday, August 08, 2017 16:05
To: LITKOWSKI Stephane OBS/OINIS
Cc: RTGWG; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Stephane,

 

See responses inline with [CB].

 

Chris

 

From: [hidden email] [[hidden email]]
Sent: Tuesday, August 8, 2017 8:25 AM
To: Chris Bowers <[hidden email]>
Cc: RTGWG <[hidden email]>; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi Chris,

 

Thanks for the review. I’m updating the document to reflect your proposals.

Couple of comments:

-          s/“otherwise the standard IP convergence MUST be used.”/ “otherwise the standard IP convergence MUST used”. It does not sound good to me but may be because of an English grammar issue on my side. Could you confirm the change ?

 

[CB]  You are correct.  That proposed change is a mistake on my part.

 

-          Regarding your main comment on section 1 and 2.1, I do not agree about your statement on RSVP-FRR. First there are multiple deployment styles of RSVP FRR:

o   LDP tunneling

o   RSVP with no strict ERO

o   RSVP with CSPF at head end (strict ERO)

Your statement is true only for the third case, where an RSVP tunnel between S and D exists with its path computed by S: there is certainly no microloop in that case. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will specify in the text that we are in an LDP scenario, for example. Here is a text proposal:

“In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C).”

“The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path.”

 

 

[CB] “when the primary path is an hop by hop defined path”  is somewhat ambiguous.

How about “when the primary path uses hop-by-hop routing” ?

 

 

For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I’m not sure that we need to go into that level of detail: if the LFA is on the post-convergence path, then the post-convergence path is loop-free, so there will be no local microloop in any case.

 

[CB]  OK. 

 

-          Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

                       

“Upon an adjacency/link down event, this document introduces a change

   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the

   network wide convergence. The new step 5 is described below:”

           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER , then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

 

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event.”

 

[CB]  This text seems clearer.
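
As an illustration only, the decision described in the proposed step 5 can be sketched as below. This is a minimal sketch and not text from the draft: the function name, the parameter names, and the 2000 ms value are hypothetical, and the two boolean inputs are assumed to have been determined by the implementation before the call.

```python
def rib_fib_update_delay_ms(single_local_link_down: bool,
                            new_convergence_stopped_timer: bool,
                            uloop_delay_down_timer_ms: int) -> int:
    """Return how long (in ms) to wait before updating the RIB/FIB.

    Called once SPF_DELAY has expired and the SPF has been computed.
    A return value of 0 means "update immediately".
    """
    # Delay only for a single local link-down event, and only if this
    # convergence did not stop an already running ULOOP_DELAY_DOWN_TIMER.
    if single_local_link_down and not new_convergence_stopped_timer:
        return uloop_delay_down_timer_ms
    # Every other case falls back to the normal, immediate update.
    return 0


# Hypothetical timer value of 2000 ms, chosen only for the example.
print(rib_fib_update_delay_ms(True, False, 2000))   # 2000
print(rib_fib_update_delay_ms(True, True, 2000))    # 0
print(rib_fib_update_delay_ms(False, False, 2000))  # 0
```

The sketch covers only the scheduling decision; stopping a running timer and applying the update are left to the surrounding implementation.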

 

Brgds,

 

Stephane

 

 

From: Chris Bowers [[hidden email]]
Sent: Tuesday, August 08, 2017 03:01
To: LITKOWSKI Stephane OBS/OINIS; [hidden email]
Cc: [hidden email]
Subject: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Authors,

 

I’m in the process of doing the Shepherd write-up for draft-ietf-rtgwg-uloop-delay-05.txt. 

 

In reading the latest version of the document, I wrote down some feedback.

A diff can be found at:

 

 

Most of the feedback is related to clarifying language and typos.  However there

are a few comments that I think are more substantive so I am

reproducing them below since they should probably be discussed on the list.

 

===========

[CB]  I find the examples presented in section 1 and section 2.1 to

be confusing.  The conclusion drawn in the last paragraph of section

2.1 does not seem to follow from these examples.

 

Section 1 (figure 1) shows an example of micro-loops occurring when shortest

path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR. 

 

Section 2.1 (figure 2) also shows an example of micro-loops occurring when

shortest path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR.  However, in this example,

a one-hop RSVP tunnel is provisioned to provide link protection for one of

the links.  However, even with this one-hop RSVP tunnel the example

demonstrates that micro-loops can occur.

 

The last paragraph asserts that:

"The issue described here is completely independent of the fast-

reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...)."

 

There are two problems with this assertion.

 

Problem 1) I don't think that the assertion is correct for RSVP TE-FRR in general.

 

For classical RSVP TE-FRR, there would be an RSVP-signaled LSP from S to D. 

Before the failure of the link C-B, this LSP would follow the path

S-E-C-B-A-D.  Immediately after the failure of link C-B, the LSP would

follow the path S-E-C-E-A-B-A-D using the bypass LSP at C.  Once S is

made aware of the failure, S will resignal the LSP to take the path S-E-A-D. 

At no time would looping occur. 

 

I assume that it wasn't the initial intention to claim that RSVP TE-FRR suffers from

micro-looping, but the text currently reads that way.  The assertion of the last

paragraph should be qualified to talk about how microloops will still affect traffic

forwarded hop-by-hop over links protected with one-hop RSVP-signaled LSPs.

 

Problem 2) The assertion may be correct for LFA/rLFA and MRT, but it has not

been demonstrated with the examples provided.  I think it may instead be

the case that the assertion may not be true for local LFA in some circumstances.

In particular, if traffic to a given destination can be protected for a given

failure by the PLR using a local LFA that is the same as the post convergence

path, then that traffic will not be subject to microloops.

 

Perhaps the overall intention of the example in figure 2 using

links protected with one-hop RSVP-signaled LSPs was to say that no

matter how much flexibility you give yourself in building a backup path

from the PLR, if the PLR stops using the backup path before other routers

stop sending traffic to the PLR, then you can still have forwarding loops.

However, I think the complexity and detail of the example using one-hop

RSVP-signaled LSPs ends up confusing the matter.

 

The text should either work more systematically through examples to

substantiate the assertion, or the assertion should be scaled back. 

Regardless, the assertion needs to be clarified with respect to RSVP-TE FRR.

 

====== 

Section 4.4

 

[CB]  It would be good to write out exactly what the modified version of step 5

looks like so there is no confusion. Something like:

 

5.  Upon SPF_DELAY timer expiration, the SPF is computed.  If the condition

of a single local link-down event has been met, then an update of the

RIB and the FIB is scheduled in ULOOP_DELAY_DOWN_TIMER msecs.  Otherwise,

the RIB and FIB update is scheduled immediately.

 

=========

 

   Such a delay

   SHOULD only be introduced if all the LSDB modifications processed are

   only reporting a single local link down event (Section 4.3).  If a

   subsequent LSP/LSA is received/updated and a new SPF computation is

   triggered before the expiration of ULOOP_DELAY_DOWN_TIMER, then the

   same evaluation SHOULD be performed.

 

=========  

[CB] What should one do if the evaluation of a subsequent LSP/LSA fails

at this point?  Do you go ahead and update the FIB with the forwarding

entries that you were waiting to do?  Or do you do a new SPF with the

new information?  Or is it up to the implementation?

=========

 

I also ran the idnits check, which shows the following issues.

Can you get rid of the unused references and move RFC 5715 from Normative to informational so that idnits will run clean? 

 

 

Thanks,

Chris

 

 



RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

stephane.litkowski-2

Hi,

 

Thanks for your feedback, please find some comments inline.

 

Brgds,

 

Stephane

 

 

From: Sikhivahan Gundu [mailto:[hidden email]]
Sent: Wednesday, August 09, 2017 12:19
To: LITKOWSKI Stephane OBS/OINIS; Chris Bowers
Cc: [hidden email]; RTGWG
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi,

 

Requesting a couple of clarifications.

 

>> If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped

 

Do we stop the timer if  “new convergence” is a result only of links coming

up, i.e, no links have failed?  My interpretation of the old text,  as well as the

revision, is that we don’t, but in the light of the discussion that this passage

triggered, it seems better to have the interpretation validated, as below:

 

[SLI] Let’s say that you have a convergence triggered by a local link down; this convergence will apply the ULOOP_DELAY_DOWN_TIMER.

If, while the timer is running, a new topology change occurs (metric change, link up or down, whether local or remote), we need to update the FIB with the latest topology without any further delay.

If we do not do so, the local router will use an N-2 FIB version while the other routers will start to use the latest version N, which could cause side effects.

 

 

Imagining the IGP router to be in one of two states:

-- NORMAL-UPDATE state (FIB updated “normally”), also the initial state,

-- and DELAYED-UPDATE state (FIB updated after ULOOP_DELAY_TIMER units of time),

the draft seems to suggest the following state transitions. I’d greatly appreciate
validation.

+----------------+---------------------------------------------------------------------+----------------+
| current state  | event                                                               | next state     |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one local link failure                                              | DELAYED-UPDATE |
| DELAYED-UPDATE | one local link failure                                              | NORMAL-UPDATE  |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one remote link failure, OR two or more (any kind of) link failures | NORMAL-UPDATE  |
| DELAYED-UPDATE | one remote link failure, OR two or more (any kind of) link failures | NORMAL-UPDATE  |
+----------------+---------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | no link failures (only link-ups)                                    | NORMAL-UPDATE  |
| DELAYED-UPDATE | no link failures (only link-ups)                                    | DELAYED-UPDATE |
+----------------+---------------------------------------------------------------------+----------------+

 

[SLI] The last line should be current state DELAYED-UPDATE , next state NORMAL-UPDATE.
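
For illustration, the two-state model with the [SLI] correction applied can be sketched as follows. This is only a sketch of the discussion in this thread, not text from the draft: the `State` and `Event` enums and the `next_state` function are hypothetical names, and the events are assumed to be classified exactly as in the three event rows of the table.

```python
from enum import Enum


class State(Enum):
    NORMAL_UPDATE = "NORMAL-UPDATE"
    DELAYED_UPDATE = "DELAYED-UPDATE"


class Event(Enum):
    ONE_LOCAL_LINK_FAILURE = "one local link failure"
    REMOTE_OR_MULTIPLE_FAILURES = "one remote link failure, or two or more link failures"
    LINK_UP_ONLY = "no link failures (only link-ups)"


def next_state(state: State, event: Event) -> State:
    """Return the next state for the (hypothetical) two-state model."""
    # Any new convergence while the delay is pending aborts the delay,
    # including link-up-only events (the corrected last row).
    if state is State.DELAYED_UPDATE:
        return State.NORMAL_UPDATE
    # From NORMAL-UPDATE, only a single local link failure triggers the delay.
    if event is Event.ONE_LOCAL_LINK_FAILURE:
        return State.DELAYED_UPDATE
    return State.NORMAL_UPDATE


# Print the full transition table.
for s in State:
    for e in Event:
        print(f"{s.value:15} | {e.value:55} | {next_state(s, e).value}")
```

With the correction, every event received in DELAYED-UPDATE leads back to NORMAL-UPDATE, so the model reduces to a single question: is a delay currently pending?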

 

 

Second: remote loops are illustrated as a non-applicable scenario for this

solution. How about local link failures that do not lead to (local) loops?

Applying the delay in such a case may result in packet loss if there is no

FRR backup.  OTOH, detecting that a local loop will form  involves more

computation.   

 

[SLI] I agree with you; that’s why the draft encourages using the mechanism in combination with FRR. The draft does not prevent an implementation from detecting whether a loop exists before applying the mechanism.

 

 

Thanks,

Sikhi

 

 



RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

Sikhivahan Gundu

Hi Stephane,

 

>> If during the timer run, a new topology change occurs (metric change, link up or down whatever it is local or remote), we need to update the

 

I’ve been wrongly assuming that the delay is applied only to those

entries that are likely to cause microloops, but now realize the draft

is advocating delaying the entire IGP routing table. In that case, any

topology change that follows the original link failure will of course

have to abort the delay.

 

Thanks for the clarification.

 

Sikhi

 

 

From: [hidden email] [mailto:[hidden email]]
Sent: 09 August 2017 17:07
To: Sikhivahan Gundu <[hidden email]>; Chris Bowers <[hidden email]>
Cc: [hidden email]; RTGWG <[hidden email]>
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi,

 

Thanks for your feedback, please find some comments inline.

 

Brgds,

 

Stephane

 

 

From: Sikhivahan Gundu [[hidden email]]
Sent: Wednesday, August 09, 2017 12:19
To: LITKOWSKI Stephane OBS/OINIS; Chris Bowers
Cc: [hidden email]; RTGWG
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi,

 

Requesting a couple of clarifications.

 

>> If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped

 

Do we stop the timer if  “new convergence” is a result only of links coming

up, i.e., no links have failed?  My interpretation of the old text, as well as the

revision, is that we don’t, but in the light of the discussion that this passage

triggered, it seems better to have the interpretation validated, as below:

 

[SLI] Let’s say that you have a convergence triggered by a local link down; this convergence will apply the ULOOP_DELAY_DOWN_TIMER.

If during the timer run, a new topology change occurs (metric change, link up or down, whether it is local or remote), we need to update the FIB with the latest topology without any further delay.

If we do not do so, the local router will use an N-2 FIB version while the other routers start to use the latest version N; this could cause side effects.

 

 

Imagining the IGP router to be in one of two states:

-- NORMAL-UPDATE state (FIB updated “normally”), also the initial state,

-- and DELAYED-UPDATE state (FIB updated after ULOOP_DELAY_TIMER units of time),

   

the draft seems to suggest the following state transitions. I’d greatly appreciate

validation.

 

+----------------+--------------------------------------------------------------------+----------------+
| current state  | event                                                              | next state     |
+----------------+--------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one local link failure                                             | DELAYED-UPDATE |
| DELAYED-UPDATE | one local link failure                                             | NORMAL-UPDATE  |
+----------------+--------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | one remote link failure OR two or more (any kind of) link failures | NORMAL-UPDATE  |
| DELAYED-UPDATE | one remote link failure OR two or more (any kind of) link failures | NORMAL-UPDATE  |
+----------------+--------------------------------------------------------------------+----------------+
| NORMAL-UPDATE  | no link failures (only link-ups)                                   | NORMAL-UPDATE  |
| DELAYED-UPDATE | no link failures (only link-ups)                                   | DELAYED-UPDATE |
+----------------+--------------------------------------------------------------------+----------------+

 

[SLI] The last line should be current state DELAYED-UPDATE , next state NORMAL-UPDATE.
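
The table above, with the [SLI] correction applied, can be sketched as a small transition function. This is only an illustration, assuming the three event classes are exactly the rows of the table; the names State, Event and next_state are hypothetical and do not come from the draft.

```python
from enum import Enum


class State(Enum):
    NORMAL_UPDATE = "NORMAL-UPDATE"
    DELAYED_UPDATE = "DELAYED-UPDATE"


class Event(Enum):
    # Hypothetical event classes mirroring the three rows of the table.
    ONE_LOCAL_LINK_FAILURE = 1
    OTHER_LINK_FAILURE = 2  # one remote failure, or two or more failures of any kind
    LINK_UP_ONLY = 3        # no link failures, only link-ups


def next_state(current: State, event: Event) -> State:
    """Return the next state for a convergence event.

    With the [SLI] correction, the only transition into DELAYED-UPDATE is a
    single local link failure seen in NORMAL-UPDATE. Every other event,
    including a link-up seen while delaying, falls back to NORMAL-UPDATE.
    """
    if current is State.NORMAL_UPDATE and event is Event.ONE_LOCAL_LINK_FAILURE:
        return State.DELAYED_UPDATE
    return State.NORMAL_UPDATE
```

Under this reading, any event arriving in DELAYED-UPDATE aborts the delay, which matches the statement that a new topology change during the timer run must update the FIB without further delay.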

 

 

Second: remote loops are illustrated as a non-applicable scenario for this

solution. How about local link failures that do not lead to (local) loops?

Applying the delay in such a case may result in packet loss if there is no

FRR backup.  OTOH, detecting that a local loop will form  involves more

computation.   

 

[SLI] I agree with you; that’s why the draft encourages using the mechanism in combination with FRR. The draft does not prevent an implementation from detecting whether a loop exists before applying the mechanism.

 

 

Thanks,

Sikhi

 

 

From: rtgwg [[hidden email]] On Behalf Of [hidden email]
Sent: 08 August 2017 20:50
To: Chris Bowers <[hidden email]>
Cc: [hidden email]; RTGWG <[hidden email]>
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Thanks Chris, I will post a new revision with those changes.

 

 

From: Chris Bowers [[hidden email]]
Sent: Tuesday, August 08, 2017 16:05
To: LITKOWSKI Stephane OBS/OINIS
Cc: RTGWG; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Stephane,

 

See responses inline with [CB].

 

Chris

 

From: [hidden email] [[hidden email]]
Sent: Tuesday, August 8, 2017 8:25 AM
To: Chris Bowers <[hidden email]>
Cc: RTGWG <[hidden email]>; [hidden email]
Subject: RE: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Hi Chris,

 

Thanks for the review. I’m updating the document to reflect your proposals.

Couple of comments:

-          s/“otherwise the standard IP convergence MUST be used.”/ “otherwise the standard IP convergence MUST used”. It does not sound good to me but may be because of an English grammar issue on my side. Could you confirm the change ?

 

[CB]  You are correct.  That proposed change is a mistake on my part.

 

-          Regarding your main comment on section 1 and 2.1, I do not agree about your statement on RSVP-FRR. First there are multiple deployment styles of RSVP FRR:

o   LDP tunneling

o   RSVP with no strict ERO

o   RSVP with CSPF at head end (strict ERO)

Your statement is true only for the third case, where an RSVP tunnel between S and D exists with its path computed by S => no uloop in that case for sure. But as soon as you rely on distributed convergence, you will fall into a loop even if you use RSVP-FRR. I will specify in the text that we are in an LDP scenario, for example. Here is a text proposal:

“In the Figure 2, we consider an IP/LDP routed network. An RSVP-TE tunnel T, provisioned on C and terminating on B, is used to protect the traffic against C-B link failure (IGP shortcut is activated on C).”

“The issue described here is completely independent of the fast-reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...) when the primary path is an hop by hop defined path.”

 

 

[CB] “when the primary path is an hop by hop defined path”  is somewhat ambiguous.

How about “when the primary path uses hop-by-hop routing” ?

 

 

For the LFA case, yes, there are some cases where there is no loop, but it is topology dependent. I’m not sure that we need to go into that level of detail: if the LFA is on the postconvergence path, the postconvergence path is loop-free, so there will be no local microloop in any case.

 

[CB]  OK. 

 

-          Regarding your comment on section 4.4, here is my new text proposal to fit your comment:

                       

“Upon an adjacency/link down event, this document introduces a change

   in step 5 (<xref target="description-current"/>) in order to delay the local convergence compared to the

   network wide convergence. The new step 5 is described below:”

           5. Upon SPF_DELAY timer expiration, the SPF is computed. If the condition of a single local link-down event has been met and if the new convergence did not trigger a stop of the ULOOP_DELAY_DOWN_TIMER , then an update of the RIB and the FIB SHOULD be delayed for ULOOP_DELAY_DOWN_TIMER msecs. Otherwise, the RIB and FIB SHOULD be updated immediately.

 

If a new convergence occurs while ULOOP_DELAY_DOWN_TIMER is running, ULOOP_DELAY_DOWN_TIMER is stopped and the RIB/FIB SHOULD be updated as part of the new convergence event.”

 

[CB]  This text seems clearer.
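
The proposed step 5 above can be read as a small decision on how long to hold the RIB/FIB update after an SPF run. The following is a minimal sketch of that reading, not the draft's algorithm: the SpfResult fields, the function name, and the 2000 ms value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical value; the thread does not give a number for the timer.
ULOOP_DELAY_DOWN_TIMER_MS = 2000


@dataclass
class SpfResult:
    # True if the LSDB changes processed report exactly one local link-down event.
    single_local_link_down: bool
    # True if a ULOOP_DELAY_DOWN_TIMER from an earlier convergence is still running.
    delay_timer_running: bool


def rib_fib_update_delay_ms(result: SpfResult) -> int:
    """Return how long to wait before updating the RIB/FIB, in milliseconds."""
    if result.delay_timer_running:
        # A new convergence while the timer runs stops the timer; the update
        # happens immediately as part of the new convergence event.
        return 0
    if result.single_local_link_down:
        return ULOOP_DELAY_DOWN_TIMER_MS
    return 0
```

For example, a first SPF run after a single local link-down yields the full delay, while a second SPF run triggered before the timer expires yields zero regardless of what the second event was.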

 

Brgds,

 

Stephane

 

 

From: Chris Bowers [[hidden email]]
Sent: Tuesday, August 08, 2017 03:01
To: LITKOWSKI Stephane OBS/OINIS; [hidden email]
Cc: [hidden email]
Subject: shepherd feedback and idnits on draft-ietf-rtgwg-uloop-delay-05.txt

 

Authors,

 

I’m in the process of doing the Shepherd write-up for draft-ietf-rtgwg-uloop-delay-05.txt. 

 

In reading the latest version of the document, I wrote down some feedback.

A diff can be found at:

 

 

Most of the feedback is related to clarifying language and typos.  However, there

are a few comments that I think are more substantive, so I am

reproducing them below since they should probably be discussed on the list.

 

===========

[CB]  I find the examples presented in section 1 and section 2.1 to

be confusing.  The conclusion drawn in the last paragraph of section

2.1 does not seem to follow from these examples.

 

Section 1 (figure 1) shows an example of micro-loops occurring when shortest

path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR. 

 

Section 2.1 (figure 2) also shows an example of micro-loops occurring when

shortest path forwarding is used and the metrics are such that LFA and rLFA

produce no backup paths from the PLR.  However, in this example,

a one-hop RSVP tunnel is provisioned to provide link protection for one of

the links.  However, even with this one-hop RSVP tunnel the example

demonstrates that micro-loops can occur.

 

The last paragraph asserts that:

"The issue described here is completely independent of the fast-

reroute mechanism involved (TE FRR, LFA/rLFA, MRT ...)."

 

There are two problems with this assertion.

 

Problem 1) I don't think that the assertion is correct for RSVP TE-FRR in general.

 

For classical RSVP TE-FRR, there would be an RSVP-signaled LSP from S to D. 

Before the failure of the link C-B, this LSP would follow the path

S-E-C-B-A-D.  Immediately after the failure of link C-B, the LSP would

follow the path S-E-C-E-A-B-A-D using the bypass LSP at C.  Once S is

made aware of the failure, S will resignal the LSP to take the path S-E-A-D. 

At no time would looping occur. 

 

I assume that it wasn't the initial intention to claim that RSVP TE-FRR suffers from

micro-looping, but the text currently reads that way.  The assertion of the last

paragraph should be qualified to talk about how microloops will still affect traffic

forwarded hop-by-hop over links protected with one-hop RSVP-signaled LSPs.

 

Problem 2) The assertion may be correct for LFA/rLFA and MRT, but it has not

been demonstrated with the examples provided.  I think it may instead be

the case that the assertion may not be true for local LFA in some circumstances.

In particular, if traffic to a given destination can be protected for a given

failure by the PLR using a local LFA that is the same as the post convergence

path, then that traffic will not be subject to microloops.

 

Perhaps the overall intention of the example in figure 2 using

links protected with one-hop RSVP-signaled LSPs was to say that no

matter how much flexibility you give yourself in building a backup path

from the PLR, if the PLR stops using the backup path before other routers

stop sending traffic to the PLR, then you can still have forwarding loops.

However, I think the complexity and detail of the example using one-hop

RSVP-signaled LSPs ends up confusing the matter.

 

The text should either work more systematically through examples to

substantiate the assertion, or the assertion should be scaled back. 

Regardless, the assertion needs to be clarified with respect to RSVP-TE FRR.

 

====== 

Section 4.4

 

[CB]  It would be good to write out exactly what the modified version of step 5

looks like so there is no confusion. Something like:

 

5.  Upon SPF_DELAY timer expiration, the SPF is computed.  If the condition

of a single local link-down event has been met, then an update of the

RIB and the FIB is scheduled in ULOOP_DELAY_DOWN_TIMER msecs.  Otherwise,

the RIB and FIB update is scheduled immediately.

 

=========

 

   Such a delay

   SHOULD only be introduced if all the LSDB modifications processed are

   only reporting a single local link down event (Section 4.3).  If a

   subsequent LSP/LSA is received/updated and a new SPF computation is

   triggered before the expiration of ULOOP_DELAY_DOWN_TIMER, then the

   same evaluation SHOULD be performed.

 

=========  

[CB] What should one do if the evaluation of a subsequent LSP/LSA fails

at this point?  Do you go ahead and update the FIB with the forwarding

entries that you were waiting to do?  Or do you do a new SPF with the

new information?  Or is it up to the implementation?

=========

 

I also ran the idnits check, which shows the following issues.

Can you get rid of the unused references and move RFC 5715 from Normative to informational so that idnits will run clean? 

 

 

Thanks,

Chris

 

 
