Message
From: nico at seul.org<nico@s...>
Date: Mon Aug 16 16:14:14 CEST 2004
Subject: [oc] Parallel Array Processor Project
I just quickly read your web site. I read: "How about running redesigned 80486 with 2.8GHz clock and having one hundred of those serving your applications? "
Be careful! A 486 built with today's technology would run at about 200 to 400 MHz, or even less! What makes the difference between the P4 and the 486 is the 20 (and now 40) pipestages.
The 122 million transistor chip from ATI (X800) runs at around 350 MHz.
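To make the pipestage argument concrete, here is a rough sketch. It assumes the clock period is the total logic delay split evenly over the stages, plus a fixed latch overhead per stage. The 12.5 ns of logic and 0.1 ns of overhead are illustrative assumptions, not measurements of any real chip, and the 5-stage case only stands in for a 486-like core.

```python
def max_clock_mhz(total_logic_ns, stages, latch_overhead_ns):
    """Estimate the clock of a pipeline that splits the logic evenly."""
    period_ns = total_logic_ns / stages + latch_overhead_ns
    return 1000.0 / period_ns  # 1000 / ns gives MHz

TOTAL_LOGIC_NS = 12.5  # assumed logic delay of one instruction
LATCH_NS = 0.1         # assumed register overhead added by every stage

for stages in (5, 20, 40):
    mhz = max_clock_mhz(TOTAL_LOGIC_NS, stages, LATCH_NS)
    print(f"{stages:2d} stages: about {mhz:.0f} MHz")
```

With these assumed numbers the 5-stage design stays in the few-hundred-MHz range, while 20 or 40 stages reach the GHz range on the same technology.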
>> > That's why I have borrowed many things from VHDL to my HLL.
>> >
>> Why yet another new language?
>
> There are not many programming languages which support low-level,
> fine-grained parallelism. Many of them are hard to use for general
> purpose programming (e.g. APL) or they are based on sequential
> constructions (LISP, Prolog). The most usable programming languages
> are sequential (C, Pascal and the like).
>
I have heard about C with extensions that use the paradigm of uniform memory addressing.
> [MatrioshkaBrains]
>> > > I think the best model is a PC cluster. Every node contains
>> > > memory and is linked to the closest other cell. Then the
>> > > routing algorithm could be fractal-like.
>> >
>> > Well, my design principles are:
>> >
>> > * Every node contains a memory - at least one single 8-bit
>> >   result and an instruction.
>>
>> If it's some tiny CPU programmed like the macro-cells of an FPGA are, it
>> must also contain all the code memory.
>
> The array itself contains the code - a single cell contains only the
> instruction it will execute, when fired.
So you will need a very, very fast network to deliver instructions to each cell :/
>
>> > * Neighborhood connection: The way to route data to distant
>> >   cells is up to the software.
>>
>> So the local node decides the route to another cell.
>
> Yes.
>
>> That's not the TCP/IP motto :/ The local node must be aware
>> of the topology of the local connection.
>
> Well, the local node needs only to move data from its input to its
> result.
> With the software you construct more complex routing.
>
>> It could be really hard to make routing fast.
>
> True. That's why I have thought of having hardware accelerators for
> special purposes, like long distance data transfers. But hardware
> acceleration solutions are not scalable and thus they can be applied
> only for a specific size of the array. That's why the fundamental model
> doesn't contain them (they are used only for HW implementations).
>
> For a real world implementation, I have thought of having a hierarchical
> structure of the array. That means that at the bottom, there's a
> processor core executing the cell instructions. They form a larger unit,
> and connecting these larger units you form even larger units.
>
The problem is that the resulting bandwidth could be low.
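A small sketch of why I worry about this. It assumes a 2D array where a cell can only hand data to one of its four neighbors per step, and it routes along X first and then along Y. The function names and the 100 x 100 array size are my own assumptions for illustration, not something taken from the design.

```python
def xy_route(src, dst):
    """List the cells visited going from src to dst, one neighbor step
    at a time: first along X, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def hops(src, dst):
    """Number of neighbor-to-neighbor transfers needed."""
    return len(xy_route(src, dst)) - 1

# Corner to corner on an assumed 100 x 100 array.
print(hops((0, 0), (99, 99)))
```

Every cell on the path has to spend a step forwarding the data, so the transfer time grows with the distance, and the cells in between are busy relaying instead of computing.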
> But I have somewhat covered those issues on my web site.
I should read it more carefully :)
>
>> > The normal PC clusters are often topologically stars (central
>> > control) or fully connected (each node can contact any node
>> > in the cluster).
>>
>> http://aggregate.org/KLAT2/ describes another topology, much more
>> efficient and scalable.
>
> Well, yes, that's one possible hardware topology. In my design I'm still
> quite sceptical about using anything other than neighborhood connections,
> for the reasons I described above.
I believe you will have the same problem as place & route tools for ASIC/FPGA: sometimes data has to be provided to hundreds of cells, and that could consume a lot of cells just for routing. Some data replication/modification propagation model should be used :)
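Here is a rough sketch of what I mean. It assumes a 2D array with neighbor-only links and simple X-then-Y routing, and it compares sending a separate copy to every consumer with letting each consumer pass the data on to the next one. The function names and the 10 x 10 layout are hypothetical, chosen only for illustration.

```python
def xy_route(src, dst):
    """Cells visited from src to dst, stepping along X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def relay_cells(src, dests):
    """Cells used only for forwarding when src sends a separate copy
    to every destination."""
    used = set()
    for d in dests:
        used.update(xy_route(src, d))
    return used - {src} - set(dests)

def chained_relay_cells(src, dests):
    """Same, but each destination forwards the data to the next one
    (a simple replication/propagation scheme)."""
    used = set()
    prev = src
    for d in dests:
        used.update(xy_route(prev, d))
        prev = d
    return used - {src} - set(dests)

# Assumed layout: producer in one corner, ten consumers along the far row.
dests = [(x, 9) for x in range(10)]
print(len(relay_cells((0, 0), dests)), len(chained_relay_cells((0, 0), dests)))
```

In this assumed layout the separate copies tie up 89 cells just for routing, while the propagation scheme needs only 8.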
nicO
> _______________________________________________
> http://www.opencores.org/mailman/listinfo/cores
>