Sunday, April 12, 2020

Tech Book Face Off: Programming Massively Parallel Processors Vs. Professional CUDA C Programming

After getting an introduction to GPU programming with CUDA by Example, I wanted to dig in deeper and get to know the real ins and outs of CUDA programming. That desire quickly led to the selection of books for this Tech Book Face Off. The first book is definitely geared to be a college textbook, and as I spent years learning from books like this, I felt comfortable taking a look at Programming Massively Parallel Processors: A Hands-on Approach by David B. Kirk and Wen-mei W. Hwu. The second book is targeted more at the working professional, as the title suggests: Professional CUDA C Programming by John Cheng, Max Grossman, and Ty McKercher. I was surprised by both books, and not in the same way. Let's see how they do at teaching CUDA programming.

Programming Massively Parallel Processors front cover

VS.

Professional CUDA C Programming front cover

Programming Massively Parallel Processors

The polite way to critique this book is to say, it's somewhat verbose and repetitive, but if you can get past that, it has a lot to offer in the way of example CUDA programs that show how to optimize code for the GPU architecture. A slightly less polite way to say that would be that while this book does offer some good code examples, the writing leaves much to be desired, and much better books are out there that cover the same material. The honest assessment is that this book is just a mess. Half the book could be cut and the other half rewritten to better explain things with clearer, non-circular definitions. The only good thing about the book is the code examples, and many of those examples are also redundant, filling the pages of the book with lines of code that the reader has seen multiple times before. This book could have been a third the length and covered the same amount of material.

Even though that last bit was a pretty harsh review, we should still explore what's in the book, if only to see how the breadth of material compares to Professional CUDA C Programming. The first chapter is the normal introduction to the book's material, describing the architecture of a GPU and discussing how parallel programming with this architecture is so different from programming on a CPU. The verbosity of this chapter alone should have been a clue that this book would drag on and on, but I was willing to give it a chance. The next chapter introduces our first real CUDA program with a vector addition kernel. We're still getting started with CUDA C at this point, so I chalk up the authors' overly detailed explanations to taking extra care with novice readers. We end up walking through all of the parts of a working CUDA program, with everything explained in excruciating detail.

The third chapter covers how to work more efficiently with threads and loading data into GPU memory from the CPU with a more complex example of calculating image blur. We also get our first exposure to thread synchronization, something that must be thoroughly understood to program GPUs effectively. This chapter is also where I start to realize how nutty some of the explanations are getting. Here's just one example of them describing how arrays are laid out in memory:

A two-dimensional array can be linearized in at least two ways. One way is to place all elements of the same row into consecutive locations. The rows are then placed one after another into the memory space. This arrangement, called row-major layout, is depicted in Fig. 3.3. To improve readability, we will use M_j,i to denote the M element at the jth row and the ith column. P_j,i is equivalent to the C expression M[j][i] but is slightly more readable. Fig. 3.3 illustrates how a 4×4 matrix M is linearized into a 16-element one-dimensional array, with all elements of row 0 first, followed by the four elements of row 1, and so on. Therefore, the one-dimensional equivalent index for M in row j and column i is j*4 +i. The j*4 term skips all elements of the rows before row j. The i term then selects the right element within the section for row j. The one-dimensional index for M_2,1 is 2*4 +1 =9, as shown in Fig. 3.3, where M₉ is the one-dimensional equivalent to M_2,1. This process shows the way C compilers linearize two-dimensional arrays.

Wow. I'm not sure how a reader that needs this level of detail for understanding how a matrix is arranged in memory is going to understand the memory hierarchy and synchronization issues of GPU programming. This explanation is just too much for a book like this. Readers should already have some knowledge of standard C programming, including multi-dimensional array memory layout, before attempting CUDA programming. I can't imagine learning both at the same time going very well. As for readers who already know how all of this stuff works, they could easily skip every other paragraph and skim the rest to make trudging through these explanations more tolerable.
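For reference, the entire quoted passage boils down to one line of index arithmetic. Here is a minimal sketch in plain C; the names `row_major_index`, `linearize_4x4`, and the `width` parameter are my own illustrative choices, not taken from the book, which fixes the width at 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the book): flat index of the element at
   row j, column i in a row-major array whose rows hold `width` elements. */
static size_t row_major_index(size_t j, size_t i, size_t width)
{
    assert(i < width);     /* the column must lie inside the row */
    return j * width + i;  /* skip j full rows, then step i columns */
}

/* Copies a 4x4 C array into a flat 16-element buffer using the index
   formula, so the two layouts can be compared element by element. */
static void linearize_4x4(int m[4][4], int flat[16])
{
    for (size_t j = 0; j < 4; ++j)
        for (size_t i = 0; i < 4; ++i)
            flat[row_major_index(j, i, 4)] = m[j][i];
}
```

With a width of 4, `row_major_index(2, 1, 4)` evaluates to 9, which matches the M_2,1 example in the quote.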

The next chapter is on how to manage memory and arrange data access to optimize memory usage and bandwidth. We find that memory management is just as important as, if not more important than, thread management for making optimal use of the GPU computing resources, and the book solidifies this understanding through an extended optimization example of a matrix multiplication kernel.

At this point we've learned the fundamentals of GPU programming, so the next chapter moves into more advanced topics in performance optimization with the memory hierarchy and the compute core architecture. Then, chapter six covers number format considerations between integers and single- and double-precision floating point representations. The authors' definition of representable numbers struck me as exceptionally cringe-worthy here:

The representable numbers of a representation format are the numbers that can be exactly represented in the format.

This is but one example of their impenetrable and useless definitions. More often than not, I found that if I hadn't already known what they were talking about, their discussions would provide no further illumination.
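A non-circular way to get the idea across is to show a test for it. The sketch below is my own illustration in plain C, not something from the book: the hypothetical helper `representable_as_float` asks whether a double value survives a round trip through single precision unchanged. It assumes a finite input within the range of `float`.

```c
#include <stdbool.h>

/* Hypothetical helper: true when `x` survives a round trip through
   single precision unchanged, i.e. when it is exactly representable
   as a float. Assumes x is finite and within the range of float. */
static bool representable_as_float(double x)
{
    float narrowed = (float)x;     /* round to the nearest float */
    return (double)narrowed == x;  /* widening back to double is exact */
}
```

Under this check, 0.5 is representable in single precision, while the double closest to 0.1 is not, because 0.1 has no finite binary expansion and the float format keeps fewer bits of it than the double does.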

Now we get into the halfway decent part of the book, the extended example chapters on parallel patterns. Each of these chapters works through a different example kernel of a particular problem that comes up often in parallel programming, and they introduce additional features of GPU programming that can assist in solving these problems in a more optimal way. The contents of these chapters are as follows:

Chapter 7: Convolution
Chapter 8: Prefix Sum (Accumulator)
Chapter 9: Parallel Histogram Calculation
Chapter 10: Sparse Matrix Computation
Chapter 11: Merge Sort
Chapter 12: Graph Search

As long as you skim the descriptions of the problems and solutions, and focus on understanding the code yourself, these chapters are quite useful examples of how to write performant parallel programs with CUDA. However, the authors continue to suffer from what seems to be a misinterpretation of the phrase, "a picture is worth a thousand words." For every diagram they use, they also include a thousand words or more of explanation, describing the diagrams ad nauseam.

The next chapter covers how to kick off kernels from within other kernels in order to enable dynamic parallelism. Up until this point, all kernels have been launched from the host (CPU) code, but it is possible to have kernels launch other kernels to take advantage of additional parallelism while the code is executing on the GPU, an effective feature for some algorithms. Then, the next three chapters are fairly useful application case studies. Like the parallel pattern example chapters, these chapters use CUDA code to show how to take advantage of more advanced features of the GPU, and how to put together everything we've learned so far to optimize some example parallel programs. The applications described are for non-Cartesian MRI, molecular visualization and analysis, and machine learning neural networks, so nice, interesting topics for GPU programming.

The last five chapters were either more drudgery or topics I wasn't interested in, so I skipped them and called it quits for this long and tedious book. For completeness, those chapters are on how to think when parallel programming (so a pep talk on what to think about from authors that couldn't clearly describe much else in the book), multi-GPU programming, OpenACC (another GPU programming framework, like CUDA), still more performance considerations, and a summary chapter.

I couldn't bring myself to keep reading chapters that wouldn't amount to anything, so I put down the book after finishing the last chapter on application case studies. I found that chapters seven through sixteen contained most of the useful information in the book, but the introduction to CUDA programming was too verbose and confusing. There are much better books out there for learning that part of CUDA programming. Case in point: CUDA by Example or the next book in this review.

Professional CUDA C Programming

Unlike the last book, I was surprised by how readable this book was. The authors did an excellent job of presenting concepts in CUDA programming in a clear, direct, and succinct manner. They also did this without resorting to humor, which can sometimes work if the author is an excellent writer, but it often feels forced and lame when done poorly. It's better to stick to clear descriptions and tight writing, as these authors did quite well. I was actually disappointed that I didn't read this book first, instead saving it until last, because it did the best job of explaining all of the CUDA programming concepts while covering essentially the same material as Programming Massively Parallel Processors and certainly more than CUDA by Example.

The first chapter is the obligatory introduction to CUDA with the requisite Hello, World program showing how to run code on the GPU. Right away, we can see how well-written the descriptions are with this discussion of how parallel programming is different from sequential programming:

When implementing a sequential algorithm, you may not need to understand the details of the computer architecture to write a correct program. However, when implementing algorithms for multicore machines, it is much more important for programmers to be aware of the characteristics of the underlying computer architecture. Writing both correct and efficient parallel programs requires a fundamental knowledge of multicore architectures.

We need to be prepared to think differently about problems when parallel programming, and we're going to have to learn the architecture of the underlying hardware to make full use of it. That leads us right into chapter 2, where we learn about the CUDA programming model and how to organize threads on the device, but it doesn't end there. Throughout the book we're learning more and more about the NVIDIA GPU architecture (specifically the older Fermi and Kepler architectures, since those were available at the time of the book's writing) in order to take full advantage of its compute power. I like how the authors grounded their discussions in specific GPU architectures and showed how the architecture was evolving from one generation to the next. I'm sure the newer Pascal, Volta, and Turing architectures provide more advanced and flexible features, but the book builds a great foundation. Chapter 2 also contains the clearest definition of a kernel that I've seen yet:

A kernel function is the code to be executed on the device side. In a kernel function, you define the computation for a single thread, and the data access for that thread. When the kernel is called, many different CUDA threads perform the same computation in parallel.

This explanation is the essence of the paradigm shift from sequential to parallel programming, and it's important to understand the effect it has on the code that you write and how it runs on the hardware. In addition to the excellent writing, each chapter has some nice exercises at the end. That's not normally something you find in programming books like this. Exercises seem to be left to textbooks, like Programming Massively Parallel Processors, which had them as well, but in Professional CUDA C Programming they're better conceived and more relevant.
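To make the kernel definition quoted above concrete, here is a rough sketch of the idea in plain C rather than CUDA C, so it runs without a GPU. It is my own illustration, not code from the book: `vec_add_thread` and `launch_sequentially` are hypothetical names, and the loop is only a sequential stand-in for a real kernel launch.

```c
#include <stddef.h>

/* The "kernel": the work for ONE logical thread, identified by tid.
   In CUDA C this would be a __global__ function, and tid would be
   computed as blockIdx.x * blockDim.x + threadIdx.x. */
static void vec_add_thread(size_t tid, const float *a, const float *b,
                           float *c, size_t n)
{
    if (tid < n)               /* guard threads past the end of the data */
        c[tid] = a[tid] + b[tid];
}

/* Hypothetical stand-in for a kernel launch: runs every logical thread
   one after another on the CPU. A GPU would run them in parallel. */
static void launch_sequentially(size_t num_threads, const float *a,
                                const float *b, float *c, size_t n)
{
    for (size_t tid = 0; tid < num_threads; ++tid)
        vec_add_thread(tid, a, b, c, n);
}
```

The point of the split is that the per-thread function contains no loop over the data. The loop lives in the launch, which is the part the GPU replaces with many threads running at once.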

The next chapter covers the CUDA execution model, or how the code runs on the real hardware. Here is where we learn how to optimize CUDA programs to take advantage of all of those independent compute cores on the GPU, and this chapter even gets into dynamic parallelism earlier in the book rather than waiting and treating it as a special topic like the last book did.

Chapter 4 covers global memory and chapter 5 looks at shared and constant memory. Understanding the trade-offs of each of these memories is important to getting the maximum performance out of the GPU because most often these programs are memory-bound, not compute-bound. Like everything else, the authors do an excellent job explaining the memory hierarchy and how those trade-offs affect CUDA programs. The examples used throughout the book are simple so that the reader doesn't get bogged down trying to understand unrelated algorithm details. The more complex examples may be thought-provoking, but simple examples do a good job of showcasing the specifics of the topic at hand.

Chapter 6 addresses streams and events, which are used to overlap computation with data transfer. Using streams can partially, or in some cases completely, hide the time it takes to get the data into the GPU memory. Chapter 7 explains more optimization techniques by using CUDA instruction-level primitives to directly control how computations are performed on the GPU. These instructions trade some accuracy for speed, and they should be used only if the accuracy is not critical to the application. The authors do a good job of explaining all of the implications here.

The last three chapters weren't as interesting to me, not because I was tired of the book this time, but because they were about the same topics that I skipped in the other CUDA books: OpenACC, multi-GPU programming, and the CUDA development process. The rest of the book was excellent, and far better than the other two CUDA books I read. The writing is clear with plenty of diagrams for better understanding of each topic, and the book organization is done well. If you're interested in GPU programming and want to read one book, this one is it.

Between these two CUDA books, the choice is obvious. Programming Massively Parallel Processors was a bloated dud. It may be worth it just for the large set of example programs it contains, but there are other options coming down the pipeline for that kind of cookbook that may be better. Professional CUDA C Programming was better in every way, and really the book to get for learning CUDA programming. The authors did a great job of explaining complex topics in GPU architecture with concise, understandable writing, relevant diagrams, and appropriate exercises for practice. It's exactly the kind of book I want for learning a new programming language, or in this case, programming paradigm. If you're at all interested in CUDA programming, it's worth checking out.

Saturday, April 11, 2020

Oceanhorn 2: Development Update


All the roads lead to Capital

It has been too long since we gave you guys an update on the development of Oceanhorn 2: Knights of the Lost Realm. Well, all five of us have been focusing on the game, and when you're really concentrated on your work, the time flies!

Grand Core: The city sized machine in the heart of Capital.

It feels like it was only yesterday when we released our first gameplay video of Oceanhorn 2 and even let people play the game at Nordic Game, but the truth is it has been months! The reception of the gameplay video was awesome. We could not get enough of all the impressions from YouTube and forums. The excitement of our audience really inspired us to work harder. After we came back to our studio, we were determined to move to the next step with the production!

Capital's seedier backside

So, what have we been up to? We have been building an adventure! More gameplay, more story, more levels, more worlds. A city. Capital is one of the central locations of Oceanhorn 2's story and it offers tons of open-ended exploration for curious adventurers. In the heart of the city is the gigantic machine Grand Core.

Meeting with friends at Master Mayfair's penthouse study

We hope to bring you more updates in the future, along with some video teasers to give you a really good look at the game! Productions like this require a lot of time and effort, but the outcome will be a cool video game so it is all worth the trouble! When you are making your dream project, you don't count the hours.

Enjoy these iPhone screenshots, folks!

Thursday, April 9, 2020

More Reaper Bones 4 Minis

More Reaper Bones 4 minis...

Thunderfoot Defender

Rear of same

Pack of Velociraptors.

Wednesday, April 1, 2020

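Re: Can't exhibit at trade shows or receive overseas visitors? We'll help you find precisely targeted customers online!

During the recent epidemic, many manufacturers have not yet resumed production, and a wave of large-scale shortages has emerged overseas.

Whether you seize the opportunity this shortage presents is up to you.

We'll help you solve this problem. We analyze which overseas customer groups your product appeals to and recommend precisely matched customer groups to you,

with automatic AI purchasing analysis, automatic targeted EDM email marketing, and automatic flagging of high-quality customer leads that come back.

QQ: 3290447103. Free online demo available.

WeChat: maoxiaoqi6688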


Sunday, March 29, 2020

People Behind The Meeples - Episode 215: Ammon Anderson

Welcome to People Behind the Meeples, a series of interviews with indie game designers. Here you'll find out more than you ever wanted to know about the people who make the best games that you may or may not have heard of before. If you'd like to be featured, head over to http://gjjgames.blogspot.com/p/game-designer-interview-questionnaire.html and fill out the questionnaire! You can find all the interviews here: People Behind the Meeples. Support me on Patreon!

Name:	Ammon Anderson
Email:	Ammonanderson@gmail.com
Location:	Utah, USA
Day Job:	I'm a full time artist.
Designing:	Two to five years.
Webpage:	Tacosforever.org
Facebook:	Facebook.com/tacothegame
Instagram:	@tacothegame
Find my games at:	My website

Today's Interview is with:

Ammon Anderson
Interviewed on: 3/8/2020

This week I actually have two interviews coming out. The first interview is with designer Ammon Anderson. Ammon has been working on his game T.A.C.O. for a while now and plans to launch it on Kickstarter very soon. T.A.C.O. is a party game about building the best taco, while messing up your opponents' recipes. So be sure to check out T.A.C.O. on Kickstarter in the next couple of weeks (it should have been live this week, but COVID-19 has delayed things a bit) and read on to learn more about Ammon and his other projects!

Some Basics
Tell me a bit about yourself.

How long have you been designing tabletop games?
Two to five years.

Why did you start designing tabletop games?
I've been designing games since I was a kid. But I've only been seriously designing games this past year and I'm working on my second already. I love it.

What game or games are you currently working on?
I am launching TACO, and am developing a board game called MOLD.

Have you designed any games that have been published?
Not yet. Soon:)

What is your day job?
I'm a full time artist.

Your Gaming Tastes
My readers would like to know more about you as a gamer.

Where do you prefer to play games?
I love all sorts of games. Right now my kids and I have been playing Colt Express a lot.

Who do you normally game with?
Friends, family, and a board gaming group in Utah.

If you were to invite a few friends together for game night tonight, what games would you play?
I love Carcassonne. I can't help it. It was my introduction to strategic board gaming.

And what snacks would you eat?
Pizza:)

Do you like to have music playing while you play games? If so, what kind?
I've never played with music on. That may be distracting to me.

What's your favorite FLGS?
Game Grid Lehi.

What is your current favorite game? Least favorite that you still enjoy? Worst game you ever played?
Current favorite is Colt Express. I really enjoy the laughter, setting a bunch of actions in place, and then watching it play out. Least favorite? Tikal. It's so LONG, but it's mesmerizing. Worst game I ever played? Trivial Pursuit. I HATE that game. And so many people love it.

What is your favorite game mechanic? How about your least favorite?
Favorite is strategic tile placement games. Least favorite is luck. Like Exploding Kittens. I don't like grenade-in-the-deck games.

What's your favorite game that you just can't ever seem to get to the table?
Star Realms. It's hard to find people who want to play it.

What styles of games do you play?
I like to play Board Games, Card Games, Miniatures Games, RPG Games, Video Games

Do you design different styles of games than what you play?
I like to design Board Games, Card Games, Miniatures Games

OK, here's a pretty polarizing game. Do you like and play Cards Against Humanity?
No

You as a Designer
OK, now the bit that sets you apart from the typical gamer. Let's find out about you as a game designer.

When you design games, do you come up with a theme first and build the mechanics around that? Or do you come up with mechanics and then add a theme? Or something else?
It's kind of a combination. But MOLD is definitely a game that spawned from the name. The game mechanics naturally developed from the idea that mold is pretty amazing.

Have you ever entered or won a game design competition?
No.

Do you have a current favorite game designer or idol?
My friend Travis Hancock at Facade Games. He's doing awesome things.

Where or when or how do you get your inspiration or come up with your best ideas?
Just bouncing ideas off of my fiancé. Our conversations bounce back and forth and the ideas just grow.

How do you go about playtesting your games?
I play with my fiancé Mel, and my kids and family and a lot of friends. Then I open it up to fans of my art.

Do you like to work alone or as part of a team? Co-designers, artists, etc.?
So far, I only work alone and Mel refines my ideas.

What do you feel is your biggest challenge as a game designer?
I love some of my ideas so much but others cannot always grasp them. I have to kill a lot of those little darling ideas.

If you could design a game within any IP, what would it be?
Facade games.

What do you wish someone had told you a long time ago about designing games?
How much I would adore it. So many people warn you not to waste your time. But that is crap. I've never been happier.

What advice would you like to share about designing games?
Follow your passions. It's the oddest ideas that make the best games. Experiencing something new.

Would you like to tell my readers what games you're working on and how far along they are?
I'm planning to crowdfund: Taco
Games that are in the early stages of development and beta testing are: Mold

Are you a member of any Facebook or other design groups? (Game Maker's Lab, Card and Board Game Developers Guild, etc.)
Most of them

And the oddly personal, but harmless stuff…
OK, enough of the game stuff, let's find out what really makes you tick! These are the questions that I'm sure are on everyone's minds!

Star Trek or Star Wars? Coke or Pepsi? VHS or Betamax?
Both, Coke, Blu-ray

What hobbies do you have besides tabletop games?
Fine art. Raising 3 amazing kids as a single dad.

What is something you learned in the last week?
I learned how to make a killer salad that I actually crave every single day.

Favorite type of music? Books? Movies?
Audiobooks. All sorts. I read every genre except romance. Favorite is Brandon Sanderson. Movies. Yes. Everything. And Netflix is my addiction.

What was the last book you read?
The Eye of the World. For the 6th time.

Do you play any musical instruments?
Nope:( regret.

Tell us something about yourself that you think might surprise people.
I illustrated 80 cards for TACO in 3 weeks.

Tell us about something crazy that you once did.
I eloped. It was CRAZY. And I wouldn't recommend it. Lol

Biggest accident that turned out awesome?
I was laid off work 18 months ago. I thought it was a disaster. It's been the single greatest blessing of my life and taken me down a totally new and wonderful road.

Who is your idol?
Brandon Sanderson. I LOVE how he thinks and what he creates.

What would you do if you had a time machine?
I'd hide it. Nobody is going to screw up history. I mean look at how we've screwed up the world!? I doubt anybody is smart enough to go back and "fix things"

Are you an extrovert or introvert?
Extrovert.

If you could be any superhero, which one would you be?
Uh... Superman. He's. SUPERMAN.

Have any pets?
Not currently. I miss having a dog.

When the next asteroid hits Earth, causing the Yellowstone caldera to explode, California to fall into the ocean, the sea levels to rise, and the next ice age to set in, what current games or other pastimes do you think (or hope) will survive into the next era of human civilization? What do you hope is underneath that asteroid to be wiped out of the human consciousness forever?
Lol. Mine! Haha. And Carcassonne.

If you'd like to send a shout out to anyone, anyone at all, here's your chance (I can't guarantee they'll read this though):
My mom. She taught me how to be creative. And my dad. He taught me to be grounded.

Thanks for answering all my crazy questions!

Thank you for reading this People Behind the Meeples indie game designer interview! You can find all the interviews here: People Behind the Meeples and if you'd like to be featured yourself, you can fill out the questionnaire here: http://gjjgames.blogspot.com/p/game-designer-interview-questionnaire.html

Did you like this interview? Please show your support: Support me on Patreon! Or click the heart at Board Game Links, like GJJ Games on Facebook, or follow on Twitter. And be sure to check out my games on Tabletop Generation.

闲言碎语(彭嘉佑)

Sunday, April 12, 2020