State of NuPIC
Formal Metadata

Title: State of NuPIC
Series: 2015 Spring NuPIC Hackathon, Part 16 of 19
Number of Parts: 19
License: CC Attribution 3.0 Unported. You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/18064 (DOI)
Transcript: English (auto-generated)
00:11
So we've had the NuPIC open source project now for almost two years; it's been open source that whole time. I try to do one of these presentations once a year,
00:22
just to give an update. They're recorded, so if anybody's watching, you can kind of see the progress of the project over time. This is a review of the past year, basically since last May when I gave the last talk. There's a trend of continued linear growth,
00:43
which is pretty well defined. So this is a good thing. Over the past year, we've grown about 142% across the average of all of the metrics that I'm tracking. That includes some social media stuff like Facebook and Twitter likes, but also the size of our mailing lists and the traffic on them.
01:02
One thing I'm not plotting here is the number of contributors who have signed our contributor license agreement. That's also growing. So healthy growth, thank you. Good thing. Over the past year, we've done 22 live planning meetings. Maybe you guys aren't aware of this,
01:20
but we do one every two weeks. The NuPIC engineers (Scott, myself, Chetan, Austin, Subutai), if we're available, are at that meeting, talking about what we've done over the past two weeks, the progress we've made, what the plan is, and everything else upcoming. So if you wanted to understand exactly
01:41
what we're working on day to day, you can pick one. All of these are on our YouTube channel. We have two YouTube channels, which is a little confusing: we have one sort of corporate Numenta channel, and the one called Official Numenta is the one for the open source stuff that's going on.
02:04
We've had six office hours. We were doing these regularly, on a roughly monthly basis, but we started kind of running out of questions. In this forum, we invite the community to come; Jeff attends, Subutai attends.
02:21
It's more of a high-level discussion. There are a lot of questions about theory, and this is a chance for you to discuss your questions about HTM theory with Jeff and Subutai. I kind of do these now on an as-requested basis. So if people start complaining that there hasn't been an office hour in a while,
02:40
I'll schedule an office hour. This complaining usually happens on our mailing list. If you want to complain, you have to sign up for the mailing list to complain, and then I'll take your complaints. We had two hackathons last year, in case you weren't aware of those. All of the videos of those are up on YouTube as well,
03:01
on our YouTube channel, just like all the videos of this one will be in a couple of weeks. There are a lot of photos of them on Flickr and descriptions of each hackathon on our blog.
03:21
One year ago we had zero releases of NuPIC; it was just HEAD of master on the repository. Since then, we've made seven official releases. Now we're at 0.2.2. We're not at a 1.0
03:41
state yet, but at least we're making progress in our semantic versioning progression. We've got a release process in place, at least, and it's pretty easy to do. We've finally got a Python package up on PyPI. If you're a Python person, you know that's important to have for your Python project.
04:01
It usually works when you do a pip install. It depends. On Linux, pip install nupic doesn't just work; I don't even want to talk about why. It works if you know the exact URL of where our package is posted.
04:22
Anyway, it's big progress; none of this even existed before. Looking at the commit statistics, you can see that there's activity on this repository. There's active development going on pretty much every day on NuPIC.
04:41
This thing is not done; we're still working on it. If you didn't know this, let me digress for a moment: we've got a lot of repositories. There is a nupic.research repository. This is somewhat of a
05:02
little bit of an experimental repository. It's where our NuPIC engineers at Numenta try new things and work on new algorithms, trying to implement new ideas. We don't want to do this in the main code base, but we want to make it transparent, so people in the community have an opportunity to
05:21
see what we're thinking, what we're working on, and what may be coming into NuPIC down the road. This is a testbed for algorithmic research and changes, and once they've been vetted, they will get merged into NuPIC. We have some disclaimers on this.
05:41
It may not be stable, it may not work, it may disappear completely. Sorry. It's essentially read-only: you can submit contributions, but we don't promise we'll even look at them. This is really very fast-paced research and development. It's all for the sake of transparency, so that you can see
06:01
what Numenta is working on. But I digress. Here is our North American developer community, and here's our European developer community. We have 153 people
06:21
who have signed our contributor license agreement. Here are some of the GitHub avatars; some people in this room are up there, people who have recently contributed. I just want to say thank you to all of you who have put in your volunteer time and your off-work hours, for no pay, to help progress this
06:41
technology. We really appreciate it. I'm really excited whenever we get a new person submitting a pull request. If you submit a pull request and I ignore it, there's something wrong; I must be out of town or something. Over the past year, the number of contributors has gone up about 65%.
07:02
But even better, the number who have pushed code to the repository, who have submitted changes that have been accepted, has gone up 100%. That's great. I love that. Thank you to all of you. 34 of us have pushed code in the past year.
07:21
That's a healthy open source project. Some of our top contributors: of course, Chetan is a very prolific programmer. I don't know if he's here. He's done a lot of work. breznak is Marek Otahal. What country is he from?
07:43
No, no, no. It's a European country. I'm sorry, Marek, I know you're watching this right now. Sorry, I forgot where you're from, Marek. But Marek's done a lot of work for us. He's a committer on the NuPIC project.
08:00
And then also David Ragazzi, who is in Brazil. He's also done a lot of work on NuPIC. Marek and David are by far our top committers from the community who are not employed by Numenta. So big thanks to those guys. They've done a lot to help progress the project, make it more usable and more accessible to more
08:21
people. NuPIC Core, our C++ set of algorithms, is sort of the core of NuPIC. Marek is also a big contributor there, as is utensil, from China, who did a lot to help improve our testing, our C++ testing.
08:41
And Richard? Where's Richard? He was here. Thanks, Richard. Thank you, Richard Crowder. He's over there, ignoring me; he's busy committing code. He's helped a lot. So he's here as well. Yeah, that's his recent
09:00
feedback. Richard has helped a lot; I'll thank you in real time. He has done most of the work to get NuPIC Core building on Windows. So those of you who were disappointed that things didn't work on Windows, thank Richard that at least something's working. You can build the C++ code on
09:21
Windows and it works. We're still working on the Python and the SWIG stuff. Then there's HTM.Java. Obviously, David Ray (where did he go?) is the lead committer there. He's the one who did the original work.
09:41
But he's getting a lot of attention from the community. Oh, you're right there. So you've got quite a few contributors on your list now; that's great. Patrick, right there, raise your hand.
10:02
His job moved him out here to work with HTM. Oh, that's wonderful. Some people are sponsored. Excellent. So we're excited about this. Good stuff. Okay, one thing I mentioned at the last hackathon: we started this
10:21
GitHub organization called NuPIC Community. One thing I want to make clear is that Numenta has no ownership of this organization or any of the code in here, because we do not own any of the copyright or anything. If you decide to put your project in here, there are no repercussions for you.
10:42
What this really is is just a bucket where we're encouraging people to put their HTM projects, so that we can go to one place and see them all. I've been putting some of my projects in here. There's NuPIC Studio in here, which is David Ragazzi's
11:01
NuPIC development and visualization tool. There's a vision repository, there's an audio repository. These are just projects for people in the community who want to work on certain subjects and get together and collaborate, so that
11:20
it's not off in just one person's GitHub; it's somewhere communal where anybody feels like they can jump in and get involved. And I don't touch this stuff unless I'm helping out. So there's no procedural overhead for any of these repositories. Everyone who submits a project becomes an owner of all
11:40
the projects in the organization. So if you're working on a hack and you want to share it, this is the best place to put it so other people can see it and potentially help out. Okay, so a word about our future priorities.
12:01
We have this day-to-day roadmap that, as developers, we use; it shows what's being worked on and what's coming next in our next milestone, the next version that everyone will work on. But we've recently put together a more high-level
12:22
list of initiatives: the things we think are the most important to get done, then medium importance, then low importance. And I want to sort of explain this because I think it's important. So the first thing on the list is visualization, and we're in the midst of this
12:41
at the moment. I'll talk in more detail about that on the next slide. But I do want to talk about pluggable components and flexible network hierarchy, and explain what that means. Okay, pluggable components. All right.
13:00
So, one of the problems that we have with contributors, especially those who want to get involved in the algorithms, is that we're very picky about algorithmic changes. We want to have a high level of scrutiny for algorithmic changes that are coming from the community.
13:21
I don't think this is a bad thing. I just think it's a bit of a roadblock for contributors, who may get frustrated because the changes that they want to make aren't getting merged in, and there's going to be a lot of scrutiny on any PR that touches the algorithms. So there are a couple of ways that we want to try and mitigate that.
13:42
This is one: refactor NuPIC a bit so that it's easier to extend. For example, pluggable encoders, so you don't have to touch any of the NuPIC code base. To create a new encoder, you just create your own encoder
14:01
off to the side, and then incorporate it just as if it were an encoder that existed within the NuPIC infrastructure. Another is to allow clients to register custom Python regions, so that as part of the Network API it's easier for you to create your own region definitions. I think that's almost done.
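To make the pluggable encoder idea concrete, here is a minimal sketch. It is not code from NuPIC: the DayOfWeekEncoder class and its parameters are hypothetical, and the getWidth()/encodeIntoArray() interface is an assumption based on what NuPIC's built-in encoders in nupic.encoders.base expose.

```python
import numpy

class DayOfWeekEncoder(object):
  """Hypothetical custom encoder. In NuPIC it would subclass
  nupic.encoders.base.Encoder (assumed interface: getWidth() reports the
  number of output bits, encodeIntoArray() writes the encoding in place)."""

  def __init__(self, bitsPerDay=3):
    self.bitsPerDay = bitsPerDay

  def getWidth(self):
    # Total number of bits in the output SDR: one block per weekday.
    return 7 * self.bitsPerDay

  def encodeIntoArray(self, inputData, output):
    # Clear the output buffer, then activate the block for this day (0-6).
    output[:] = 0
    start = int(inputData) * self.bitsPerDay
    output[start:start + self.bitsPerDay] = 1

# Usage sketch: encode "day 2" into a buffer sized by the encoder itself.
encoder = DayOfWeekEncoder()
sdr = numpy.zeros(encoder.getWidth(), dtype="uint8")
encoder.encodeIntoArray(2, sdr)
print(sdr)
```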
14:21
Actually, the custom Python region registration is done, and the same thing goes for C++ regions. What this will allow you to do is make modifications to the algorithms in your own custom regions, and then have your own little test project so you can demonstrate how your algorithmic changes
14:42
affect the output of the algorithms versus how they run without your changes. So what we really want to see, if you're working on algorithmic changes, is some type of benchmark: here's the data without your change, and here's the run with your change.
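A benchmark like that doesn't have to be fancy. As a hedged sketch (the numbers and names below are made-up placeholders, not anything NuPIC prescribes), it can be as simple as computing the same error score over a run with and without your change:

```python
def meanAbsError(predictions, actuals):
  """Mean absolute error between one run's predictions and the actual values."""
  return sum(abs(p - a) for p, a in zip(predictions, actuals)) / float(len(actuals))

# Hypothetical numbers: the same input stream run once through stock NuPIC
# and once through a build that includes your algorithm change.
actuals             = [5.0, 7.0, 9.0, 6.0]
baselinePredictions = [4.1, 8.2, 7.5, 6.9]
modifiedPredictions = [4.8, 7.3, 8.6, 6.2]

baseline = meanAbsError(baselinePredictions, actuals)
modified = meanAbsError(modifiedPredictions, actuals)
print("baseline error: %.3f" % baseline)
print("modified error: %.3f" % modified)
print("improvement:    %.1f%%" % (100.0 * (baseline - modified) / baseline))
```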
15:02
And if you can show improvement in the predictions, then that's a great thing. That's going to take you a long way. So these pluggable components will make it easier for you to set up those benchmarks. We want people to get more involved in the algorithms. But, like I said, there's a high level of scrutiny
15:21
when you get to that point, and we don't want to discourage people. So we're trying to make this easier. The second thing I wanted to mention that we're doing to help with the algorithm work is in the
15:40
NuPIC Community repo: just like we have a nupic.research repository, I created a community NuPIC research repository. The way that we test new algorithms is completely outside of NuPIC; we've got this nupic.research repository, and all those algorithms
16:02
are implemented as custom regions like I was talking about. If you want to implement new algorithms or major changes to current algorithms or whatever you want to do, I would encourage you to do it in this repository, which is sort of the community research section, because we want to see a demonstration of how your
16:22
changes improve the functionality of HTM in general. And this is a place where it can be very visible, and you can point us directly to the code, and it's easier to make the case for your algorithm changes if you can do it out here in the public, and it's not just this giant pull request against
16:42
NuPIC that is very hard for us to dedicate engineering resources to, reviewing every line of that code. We would rather see the proof in the pudding: give us something we can run, and see an example of how your changes affect performance. So this is where I would love to see
17:01
those changes start popping up, and it will be a lot easier once we get these pluggable components all completed. The other thing, aside from pluggable components, was more flexible network configuration. We are still fleshing this out, but it's going to mean better examples of creating
17:22
these different network topologies, better tools for you guys to create networks more easily, and more examples.
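For context, wiring up a topology through the Network API today looks roughly like the sketch below. The addRegion/link calls and region type names follow NuPIC's nupic.engine API as I understand it, but the parameter dictionaries are deliberately left as placeholders, so treat this as a structural sketch rather than a runnable configuration.

```python
import json
from nupic.engine import Network  # assumed import path for the Network API

net = Network()

# A sensor region feeding a spatial pooler feeding a temporal memory region.
# The empty dicts are placeholders; real regions need full parameter sets
# (column counts, input widths, an encoder and data source, and so on).
net.addRegion("sensor", "py.RecordSensor", json.dumps({"verbosity": 0}))
net.addRegion("spatialPooler", "py.SPRegion", json.dumps({}))
net.addRegion("temporalMemory", "py.TPRegion", json.dumps({}))

# Links define the hierarchy: sensor -> SP -> TM.
net.link("sensor", "spatialPooler", "UniformLink", "")
net.link("spatialPooler", "temporalMemory", "UniformLink", "")

# With real parameters and a data source in place, you would then call
# net.initialize() and net.run(N) to push records through the hierarchy.
```

The point of the flexible network configuration initiative is to make variations on this kind of wiring easier to express, with better examples to copy from.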
17:42
Okay. Like I said, we use this roadmap for the day-to-day work. This isn't the high-level view; this is really kind of a low-level view. This is what's being worked on right now, current development, NuPIC 0.3.0. Here are the tasks that people are working on, and how far along we are with them.
18:04
Okay, so I wanted to talk about serialization. Currently, the serialization model is seriously lacking. One, because it's all Python pickling. What Python pickling does is it prevents
18:22
backwards compatibility if you change any of the code in the Python classes. Pickling is like a serialization of a class instance. If you change the structure of the class, the name of the class, the name of the property, the size of the class, that changes the spec
18:42
for the pickle. There are a bunch of names in NuPIC that we would love to change, but we can't because of the serialization method we use right now.
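To illustrate why pickling pins us to the old names, here's a small, generic Python sketch (the AnomalyModel class is a made-up stand-in, not a NuPIC class):

```python
import pickle

class AnomalyModel(object):
  """Stand-in for a model class whose instances users pickle to disk."""
  def __init__(self):
    self.permanences = [0.21, 0.5, 0.73]

# A user serializes a trained model today...
blob = pickle.dumps(AnomalyModel())

# ...and later we rename the class (or move its module, or rename attributes).
del AnomalyModel

# Unpickling now fails: the pickle stream stores the module path and class
# name and expects to find exactly that class when the model is loaded.
try:
  pickle.loads(blob)
except (AttributeError, pickle.UnpicklingError) as err:
  print("cannot restore the old model: %s" % err)
```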
19:02
As soon as we change them, every model anyone has ever serialized over the past five years becomes incompatible and will not be able to be resurrected in a newer version of NuPIC. So we're working on a new serialization technique. We have some requirements for it: it must be easily transportable, it must be done entirely at the C++ level so that there's no Python involved in it at all, it must be fast, streamable, a bunch of
19:22
stuff. We've chosen Cap'n Proto for serialization, and we've got somebody at Numenta to sort of spearhead this and do most of the work to implement it. We've made lots of progress on this, but there's still more work to do, and there's still more testing to do. We're not ready to swap it out yet. When we're
19:41
ready to swap it out, at that point there will be a backwards-incompatible change and a major version change in NuPIC. After we do that is probably when we'll be ready to potentially go 1.0 for NuPIC. But then we'll have a serialization protocol that's a lot easier to change.
20:01
As we move ahead, we won't be forced to keep our old class names and method names in Python, because serialization will only happen at the C++ level. So that's why we're doing this, and that's why it's very high on our priority list: we need to get this done before we can do a lot of other things.
20:20
And the serialization is really important for the future of HTM and NuPIC in general because, ideally, talking pie in the sky here, if I can convince David to use the same serialization protocol, we could share models between HTM.Java and NuPIC. Other HTM implementations could potentially share models
20:42
and we could just share them around, toss them around in the cloud: here's my model that knows everything about the crimes that have happened in San Francisco over the past 10 years; have it, do whatever you want with it. It's a big deal to make this work. Windows.
21:00
Big frowny face. We've really been working on Windows. It was sad when everybody went to the install area right after we kicked off yesterday. I'm like, okay, who's got Windows? And half of the hands go up, and I'm so sorry. You're going to have to install and run it under Linux for now.
21:22
NuPIC Core works on Windows. Like I said, thank you Richard for making NuPIC Core work on Windows. NuPIC itself, no, not so much. I did a lot of work on it, but I couldn't get it working, probably because I don't have a Windows machine. I'm working on changing that, but other priorities keep coming up
21:41
that I have to do as the community manager, which prevents me from really digging into this meaty problem. I'm stalled. I'd love it if somebody with Windows development environment experience wants to put more time into this, but I haven't been able to. Richard? Richard thinks that if we move across the board to
22:00
Clang, maybe Clang would be a good path for the project. So Richard says if we move entirely to Clang, it would make this a lot easier. That may be true. That's something we would probably have to vote on; that's a big change. We currently support both GCC and Clang.
22:21
Yeah, yeah. Something we can discuss on the mailing list, for sure. In the meantime, if anybody has time, and Windows experience, and C++ and Python experience, I would like to hear from you if you're interested. Okay, that's it. That's the state. I don't know if
22:42
anybody has any comments, feedback for me. I would be happy to take it at this point. If you don't want to do it in a public forum, I'm always open to feedback, so get me to a corner somewhere and complain about whatever you're frustrated with. I'm always open to that. Thanks.