Hi Kurt,
Thank you for that link. I'd assumed parallel computing was the reason for
concurrency. Maybe that was a mistake?
I called the individual parallel processes components of the problem. I
also think in terms of the dependency tree. You really can't get any
faster than the longest branch that resolves all the dependent processes
required to get a result, even if those dependents come from parallel
processing. Also, in practice it doesn't make sense to only consider the
processing time of a component, since there is overhead in coordination
and scheduling. The maximum possible speedup would approach Amdahl's law
as the overhead time approaches 0 (which I guess is the point of a
theoretical maximum).
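To make that ceiling concrete, here is a rough Smalltalk workspace snippet
(my own back-of-the-envelope, not anyone's library code):

    "Amdahl's law: best-case speedup when a fraction p of the work
     parallelises over n processors."
    | amdahl |
    amdahl := [ :p :n | 1.0 / ((1.0 - p) + (p / n)) ].
    amdahl value: 0.9 value: 8.      "~4.7x on 8 processors"
    amdahl value: 0.9 value: 1000.   "~9.9x; the ceiling is 1 / (1 - p) = 10x"

A real system falls short of those numbers by exactly the coordination and
scheduling overhead mentioned above.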
If, on the other hand, we are talking about handling multiple problem sets
concurrently and not parallel processing, there are other issues to
consider. Mostly the problem becomes one of identifying available
resources and optimizing queuing processes. It doesn't make sense to have
a tiny process wait behind one that will take a very long time. Still, you
need to queue based on something. In that case I would spend more time on
building in processing-time estimates and handling dynamic rescheduling
based on time to process vs. current age in queue. The reporting system (a
system that explained what it was doing) would still work to help debug
the system, but you would be looking to optimize the schedule flow to
improve efficiency. Collecting data about how each item was scheduled
(total work in queue at scheduling, at processing, after processing), how
long it waited, and the difference between estimated and actual processing
time would be more useful.
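To sketch what I mean by queuing on an estimate plus age (Pharo-flavoured;
the Job objects answering #estimatedSeconds and #enqueuedAt are made up,
and the aging weight of 2 is arbitrary):

    | now priorityOf queue |
    now := DateAndTime now.
    "Rank small jobs first, but credit time already spent waiting so that
     long-waiting jobs are not starved."
    priorityOf := [ :job |
        job estimatedSeconds - ((now - job enqueuedAt) asSeconds * 2) ].
    queue := SortedCollection sortBlock: [ :a :b |
        (priorityOf value: a) <= (priorityOf value: b) ].

Re-evaluating that ordering as jobs age, or as estimates are corrected, is
the dynamic rescheduling part.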
Noury, did you ever share details of the actual problem you are trying to
solve?
All the best,
Ron Teitelbaum
Chief Executive Officer, 3D Immersive Collaboration Consulting, LLC
ron@3dicc.com
www.3dicc.com
https://www.facebook.com/3DICC https://twitter.com/RonTeitelbaum
https://www.linkedin.com/in/ronteitelbaum
On Tue, Sep 10, 2019 at 1:24 PM Kurt Kilpela self@kurtkilpela.com wrote:
Hey Ron,
Part of what you’ve expressed has been distilled into a ‘law’. Amdahl’s
law describes the theoretical speedup in execution latency.
https://en.wikipedia.org/wiki/Amdahl%27s_law
Note, concurrency alone would not offer this speedup. That requires
parallel execution. Without parallelism, you would still need to wait for
the aggregate computation time of all concurrent components; the latency
of the longest computation alone is not enough.
Kurt
On Sep 10, 2019, at 9:41 AM, Ron Teitelbaum ron@3dicc.com wrote:
Hi Noury,
I didn't respond because I really didn't think I had much to add, but I've
been thinking about what I would want to have in a concurrent system. This
may be obvious, but:
Concurrent systems promise either speed or scalability, and the goal
changes the requirements. We know that for speed there are two aspects to
consider. You can only go as fast as it takes your largest component to
process, and limited resources mean that going really fast on the
easy-to-process components of the problem set will only lead to a large
backlog and coordination issues.
For scalability you need to ensure completeness. You can't just send
stuff off and assume it will get processed. There is a lot to do to ensure
that everything that fails or is delayed is rescheduled, to prevent a
major backup.
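Concretely, I mean something along these lines (a sketch only; #run,
#retryCount, #reschedule:, #maxRetries and #deadLetterQueue are made-up
selectors):

    processJob: aJob
        "Never fire-and-forget: anything that fails goes back on the
         queue, up to a retry limit."
        [ aJob run ]
            on: Error
            do: [ :ex |
                aJob retryCount < self maxRetries
                    ifTrue: [ self reschedule: aJob ]
                    ifFalse: [ self deadLetterQueue add: aJob ] ]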
Both of these goals, it would seem to me, would be helped considerably by
a system that can explain itself. Depending on the size of the processing
task this could get tricky. Something that processes millions of messages
a second would need to have a much more sophisticated system to gather
information without changing the efficiency of the system itself.
Something that is less time intensive could process and keep this
information with the processed results.
What I would want to know is the time to process all of the components
compared to the time to process the full solution. I would also want to
know about errors, rescheduling, and wait times. Ultimately the best
concurrent system will run at the time to run the slowest component plus
coordination time. If the system is not running at that speed, we would
like to have information as to why. Is there a backup, a deadlock or other
error, a resource issue, or something else causing the system to delay?
Comparing the processing time of the full process with the processing time
of each component could tell you a lot about where things are going wrong
(maybe your process tree is wrong, or your scheduling is too aggressive,
causing rescheduling on a regular basis, or a component takes twice its
expected processing time because expected data wasn't available). If this
is a huge system then you can't really keep all that information, but you
could keep averages instead.
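For the huge-system case, keeping a running mean per component is enough
to avoid storing every sample. A sketch, with a hypothetical stats object
per component holding a count and a mean:

    recordComponent: aName tookSeconds: aNumber
        "Incremental mean: newMean = oldMean + (sample - oldMean) / newCount."
        | stats |
        stats := self statsAt: aName.
        stats count: stats count + 1.
        stats mean: stats mean + ((aNumber - stats mean) / stats count)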
The idea is probably not that novel but having a system that could explain
itself, what it is doing, and how well it is performing, could be a very
good debugging tool. I would be very careful to build it such that it
doesn't change the processing efficiency of the production system and that
the explanation system is optional (I know they don't really go together
[optional and no effect] which makes this very difficult). That also means
that I wouldn't try to use this reporting system to automatically change
the system itself.
Hope that helps.
All the best,
Ron Teitelbaum
Chief Executive Officer, 3D Immersive Collaboration Consulting, LLC
ron@3dicc.com
www.3dicc.com
https://www.facebook.com/3DICC https://twitter.com/RonTeitelbaum
https://www.linkedin.com/in/ronteitelbaum
On Tue, Sep 10, 2019 at 9:52 AM Nowak, Helge HNowak@cincom.com wrote:
Dear Noury,
I didn't have the time to do a thorough web search. You certainly did. I
found several references that look very promising. Many of them concern
Java and OS threads. Others are about Erlang
(https://mariachris.github.io/Pubs/ERLANG-2011.pdf). I think not only
the concurrency problem but also the base technology matters. I couldn't
find anything about Smalltalk yet, so you are probably breaking new ground.
Here is a quote from the preface of "Test-Driven Development: By Example"
(2003) by Kent Beck: "There certainly are programming tasks that can't
be driven solely by tests (or, at least not yet). Security and concurrency,
for example, are two topics where TDD is insufficient to mechanically
demonstrate that the goals of the software have been met." "Subtle
concurrency problems can't be reliably duplicated by running the code."
Good luck! I am looking forward to your findings.
Helge
Helge Nowak, Cincom Smalltalk Technical Account Manager
Cincom Systems GmbH & Co. oHG
Humboldtstraße 3
60318 Frankfurt am Main
GERMANY
office: +49 89 89 66 44 94
mobile: +49 172 74 00 402
website: http://www.cincomsmalltalk.com
email: hnowak@cincom.com
"A standpoint is an intellectual horizon of radius zero." -- Albert Einstein
From: Esug-list esug-list-bounces@lists.esug.org On Behalf Of Noury Bouraqadi
Sent: Tuesday, 10 September 2019 15:12
To: Members ESUG esug-list@lists.esug.org
Subject: Re: [Esug-list] [SPAM] Re: Concurrency Best Practices + Tests
Thank you all for your answers.
Best,
Noury
On 7 Sep 2019, at 18:42, James Foster Smalltalk@JGFoster.net wrote:
The point of Noury is: what is the way to approach concurrency when doing
TDD? Now, how to build reliable ….
I won’t pretend to answer such a broad question with anything definitive,
but I’d start my investigation of concurrency with a look at databases. One
of the primary features of any multi-user database (not just GemStone!) is
that transactions are in practice concurrent but the system is designed to
apply them as if they were serial and in a way where order does not matter.
So I would attempt to identify the concurrent tasks and verify that they
each got the result that they would have obtained had they been applied in
various serial fashions.
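One possible shape for such a check (a sketch, not GemStone API;
#exampleTasks, #freshState, #snapshot and #runConcurrently:on: are
placeholders):

    testConcurrentRunMatchesSomeSerialOrder
        | tasks serialOutcomes concurrentOutcome |
        tasks := self exampleTasks.
        serialOutcomes := Set new.
        "Collect every outcome reachable by running the tasks one after another."
        tasks asArray permutationsDo: [ :ordering |
            | state |
            state := self freshState.
            ordering do: [ :task | task value: state ].
            serialOutcomes add: state snapshot ].
        "A concurrent run is acceptable if it lands on one of those outcomes."
        concurrentOutcome := (self runConcurrently: tasks on: self freshState) snapshot.
        self assert: (serialOutcomes includes: concurrentOutcome)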
Imagine that DrTDD should be extended to support concurrent programming.
Then this is the question that we want to get answered.
I would look for a way to pass a set of repeatable tasks to DrTDD and let
the testing framework run them in various orders, then run them
concurrently with interrupts and context switching.
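As a rough idea of the concurrent part in Smalltalk (#tasks and
#checkInvariants are hypothetical; a real framework would also randomise
yields and seeds so the interleavings vary between runs):

    | done |
    done := Semaphore new.
    "Fork every task as its own Process and wait for all of them;
     Processor yield calls inside the tasks would force context switches."
    self tasks do: [ :task |
        [ task value. done signal ] forkAt: Processor userBackgroundPriority ].
    self tasks size timesRepeat: [ done wait ].
    self checkInvariants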
James
Esug-list mailing list
Esug-list@lists.esug.org
http://lists.esug.org/mailman/listinfo/esug-list_lists.esug.org